Blob Storage
Let us say we are up to designing a Uber system where we are up to the booking, renting cabs, and many other services.
Now here the consumer will be dropping text so as in order to easily communicate or call or worst dropping a nearby images where he/she wants to go. So as a driver can be there either by online call service or dropped image landed for which we will be needing databases to store the images.
Wherever in a system we are having images and videos they are not directly termed as databases and rather are referred to as Blob storages because we are directly putting them. So in order to work, no queries can be operated over them.
The below media depicts the system architecture of the Uber app as follows in order to get showcasing data stores and to get an overview of database design:
Note: Amazon S3 is one of the top providers serving as data storage, also known as blob storage.
CDN(Content Delivery Network): Images and videos that are stored in datastores (Amazon S3)now need to be widespread across different servers across the globe because there are humongous users that are querying for the same as compared to databases widespread across in accordance to geographical locations.
Blob = S3(Datastore) + CDN(Content Delivery Network)
Text Search Engine Capabilities: Now we want users to interact with the service then we need to provide searching via text of title and description. Here in order to aid searching common search engine capabilities which are seen in many system architectures such as Google Maps, Amazon Prime, and many more that we can easily think of. Now, this commonly is provided by Elastic search and Solr of which both are built on top of Apache Lucene.
Now think of what will happen if the user enters grammatically wrong words while searching how a text search engine will be working is as shown below as follows:
Fuzzy Search: It is assistance played alongside text search capabilities where the user enters grammatically wrong while typing corresponding if we do not fetch any query result will lead to bad user capabilities. Hence we make our database smart with this technique by showcasing nothing.
Now let us consider a sample system design be it Google, or Amazon where we want to store the database for analytics on all the transactions as per the system’s current or ongoing requirements then we do it a different way. Here we keep a larger chunk over all databases which is commonly known as data warehousing.
Data Warehousing: It is introduced where all the data is dumped into the database so as to serve various queering capabilities to generate reports. Data warehousing is generally not computed over online data but computed over offline data.
Example: Hadoop
Complete Reference to Databases in Designing Systems – Learn System Design
Previous Parts of this System Design Tutorial