
Monday, October 21, 2024

Data Modeling Best Practices in ArangoDB

Data modeling is a critical aspect of database design that influences the performance, scalability, and maintainability of your application. In this post, we will explore best practices for data modeling in ArangoDB, focusing on how to leverage its multi-model capabilities effectively.

Understanding the Data Structure


Before we dive into modeling practices, it’s essential to understand the data structure in ArangoDB. ArangoDB supports three primary data models:

  • Document Model: Ideal for storing unstructured or semi-structured data.
  • Key-Value Model: Best for simple lookups and caching.
  • Graph Model: Optimized for handling highly interconnected data.

Best Practices for Document Modeling

1. Use Meaningful Keys
When creating documents, use meaningful keys that reflect the content of the document. For example, use a user’s email as the key for a user document, like so:

json
{
  "_key": "john.doe@example.com",
  "name": "John Doe",
  "age": 28
}
 

2. Avoid Deep Nesting
While JSON allows for nested structures, avoid deep nesting as it can complicate querying and lead to performance issues. Keep your document structure flat when possible. Instead of this:

json
{
  "user": {
    "name": "John",
    "address": {
      "city": "Springfield",
      "zip": "62704"
    }
  }
}
Consider this simpler structure:

json
{
  "name": "John",
  "city": "Springfield",
  "zip": "62704"
}


3. Use Arrays Wisely
Arrays are a powerful feature of JSON, but use them judiciously. If you frequently need to query or update elements within an array, consider creating separate documents with relationships instead.
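For example, instead of embedding an ever-growing array of posts inside a user document, each post could be stored as its own document that points back at the user by its document handle (collection/_key). A minimal sketch, assuming a hypothetical posts collection:

json
{
  "_key": "post-1001",
  "author": "users/john.doe@example.com",
  "title": "My first post"
}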

Best Practices for Key-Value Modeling

1. Use Key-Value for Configuration and Settings
For storing application configuration settings, use the key-value model to maintain simplicity and efficiency. For example:

json
{
  "_key": "app_config",
  "theme": "dark",
  "language": "en"
}

Best Practices for Graph Modeling

1. Define Clear Relationships
When modeling relationships in your graph, be explicit about the types of connections between entities. For example, in a social network, define edges like "follows" or "friends" to represent the relationship clearly.
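In ArangoDB a relationship is itself a document stored in an edge collection, with the system attributes _from and _to referencing the connected vertices. A sketch of a "follows" edge between two hypothetical users:

json
{
  "_from": "users/john.doe@example.com",
  "_to": "users/jane.doe@example.com"
}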

2. Limit Relationship Depth

While graphs allow for traversing multiple levels of relationships, limit the depth of traversals to improve performance. For example, when querying friends of friends, consider limiting the depth to 2 to avoid excessive data retrieval.
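For example, a friends-of-friends lookup in AQL can cap the traversal depth at 2 (a sketch assuming a users document collection and a follows edge collection as above):

aql
FOR friend IN 1..2 OUTBOUND "users/john.doe@example.com" follows
  RETURN DISTINCT friend.name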

Designing Collections and Indexes

1. Group Related Documents
Organize your collections logically. For example, create a users collection for user documents and a separate posts collection for user-generated content. This keeps your data organized and manageable.

2. Create Indexes for Performance

Creating indexes on frequently queried fields can significantly improve query performance. ArangoDB does not use SQL's CREATE INDEX statement; indexes are created through its own API. For example, if you frequently search for users by email, create a persistent index on the email field from arangosh:

arangosh
db.users.ensureIndex({ type: "persistent", fields: ["email"] });

Conclusion

Effective data modeling is crucial for maximizing the capabilities of ArangoDB. By following best practices for document, key-value, and graph modeling, you can design a database that is performant, maintainable, and scalable. In the next post, we will explore performance optimization techniques in ArangoDB, including indexing strategies and query optimization.

Wednesday, April 19, 2023

How do you optimize database queries in a .NET Core Web API?

Optimizing database queries in a .NET Core Web API is an important task to improve the performance of the application. Here are some best practices to follow:

  1. Use indexing: Indexing helps to speed up data retrieval from tables. Create indexes on columns that are frequently used in WHERE clauses or joins (see the T-SQL example after this list).
  2. Avoid using SELECT *: Avoid using SELECT * in your queries. Instead, specify only the columns that are needed. This reduces the amount of data that needs to be retrieved and can speed up query execution.
  3. Use parameterized queries: Parameterized queries help to prevent SQL injection attacks and can also improve query performance. They allow database systems to cache query plans, which can be reused for subsequent queries.
  4. Use stored procedures: Stored procedures are precompiled database objects that can be executed with parameters. They can help to reduce network traffic and improve performance by minimizing the amount of data that needs to be sent between the application and the database.
  5. Use database connection pooling: Connection pooling is a technique that allows database connections to be reused. This can help to reduce the overhead of creating and closing database connections, which can improve performance.
  6. Use asynchronous queries: Asynchronous database calls free the request thread while the database is working, which improves the scalability and responsiveness of the application under load.
  7. Monitor query performance: Use tools like SQL Server Profiler to monitor the performance of your queries. This can help you to identify slow queries and optimize them.
  8. Optimize data access patterns: Use techniques like lazy loading, eager loading, and caching to optimize data access patterns. This can help to reduce the number of database queries that need to be executed and improve performance.
  9. Use database sharding: If your application is handling a large amount of data, you can consider using database sharding to improve performance. Database sharding involves dividing a large database into smaller, more manageable pieces.
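To make points 1 and 3 concrete, here is a minimal T-SQL sketch; the Users table, the Email column, and the index name are hypothetical:

-- Practice 1: index a column that is frequently used in WHERE clauses or joins
CREATE NONCLUSTERED INDEX IX_Users_Email ON dbo.Users (Email);

-- Practice 3: a parameterized query; SQL Server can cache and reuse its execution plan
EXEC sp_executesql
    N'SELECT Id, Name, Email FROM dbo.Users WHERE Email = @Email',
    N'@Email nvarchar(256)',
    @Email = N'john.doe@example.com';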

By following these best practices, you can optimize database queries in your .NET Core Web API and improve the performance of your application.

Friday, December 10, 2021

How to split a comma-separated string into multiple records using a T-SQL function?

Let's say you have a comma-separated string like:

split,a,comma,separated,string

 

Now you need to convert the string into the following rows:

split
a
comma
separated
string

 

Solution:

You can create a user-defined, table-valued function (UDF) like the one shown below. Then just pass in the comma-separated list from another query and it will return a table with each value in a separate row.

 

CREATE FUNCTION [dbo].[fnSplitStringToTable]
(
    @input nvarchar(MAX),
    @delimiter char(1) = ','
)
RETURNS
@Result TABLE
(
    Value nvarchar(MAX)
)
AS
BEGIN
    DECLARE @chIndex int
    DECLARE @item nvarchar(MAX)

    -- Loop while the remaining input still contains the delimiter
    WHILE CHARINDEX(@delimiter, @input, 0) <> 0
    BEGIN
        SET @chIndex = CHARINDEX(@delimiter, @input, 0)
        SELECT @item = SUBSTRING(@input, 1, @chIndex - 1)

        IF LEN(@item) > 0
        BEGIN
            INSERT INTO @Result(Value)
            VALUES (@item)
        END
        SELECT @input = SUBSTRING(@input, @chIndex + 1, LEN(@input))
    END
    -- Insert whatever remains after the last delimiter
    IF LEN(@input) > 0
    BEGIN
        INSERT INTO @Result(Value)
        VALUES (@input)
    END
    RETURN
END

 

Here is how you can execute the function and produce results as expected:
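For example, passing in the sample string from above returns the five rows listed earlier:

SELECT Value
FROM dbo.fnSplitStringToTable('split,a,comma,separated,string', ',');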

 

 


Thursday, September 2, 2021

SQL update from one table to another based on an ID match


While working with a SQL database you will often need to update one table based on the values of a column in another table. To do this successfully you need a common ID field or foreign key that matches rows between the two tables.

Here is the syntax to perform the task in SQL Server:

UPDATE t1
SET t1.Col = t2.Col
FROM table1 t1
INNER JOIN table2 t2 ON t1.ID = t2.ID;

 

To demonstrate this, let us assume we have the following two tables: 


Place
===========
PlaceId          int,
name             nvarchar(200),
ispublished      bit


PlaceDetails
==============
PlaceDetailId    int,
PlaceId          int,
DetailsDesc      nvarchar(max),
SeoTitle         nvarchar(max),
SeoDescription   nvarchar(max),
status           bit


Case 1: Update status field of PlaceDetails from values of ispublished in Place table.

Solution:

UPDATE d
SET d.status = p.ispublished
FROM PlaceDetails d
INNER JOIN Place p ON d.PlaceId = p.PlaceId;


Case 2: Update status field of PlaceDetails from values of ispublished in Place table only for the places that are published.

Solution: You need to add a WHERE condition to the previous query for this purpose.

UPDATE d
SET d.status = p.ispublished
FROM PlaceDetails d
INNER JOIN Place p ON d.PlaceId = p.PlaceId
WHERE p.ispublished = 1;


Case 3: Update the SeoTitle and SeoDescription fields of PlaceDetails for a specific place type

Solution: This is a more complex variation that updates a table based on the result of a joined query. For this purpose we consider some additional tables. 

The following query is used to generate the SEO Title and SEO Description information for all of the places of a category:

SELECT p.PlaceId, p.name,
       'Book a room in ' + p.name + ', ' + c.Name + ' | addressschool.com',
       p.name + ' is a ' + t.Name + ' in ' + c.Name + ', ' + n.Name + ' that is located in ' + d.formatted_address
FROM Place AS p
INNER JOIN PlaceDetail AS d ON p.PlaceId = d.PlaceId
INNER JOIN City AS c ON p.CityId = c.Id
INNER JOIN PlaceType AS t ON p.PlaceTypeId = t.Id
INNER JOIN Country AS n ON p.CountryId = n.Id
WHERE p.PlaceTypeId = 1;

 

Now we use the above query as an inline view (derived table) in the UPDATE statement to update the SeoTitle and SeoDescription information in the PlaceDetails table:

UPDATE pd
SET
    pd.SeoTitle = SR.SeoTitle,
    pd.SeoDescription = SR.SeoDescription
FROM PlaceDetail pd
INNER JOIN
    (SELECT p.PlaceId,
            'Book a room in ' + p.name + ', ' + c.Name + ' | addressschool.com' AS SeoTitle,
            p.name + ' is a ' + t.Name + ' in ' + c.Name + ', ' + n.Name + ' that is located in ' + d.formatted_address AS SeoDescription
     FROM Place AS p
     INNER JOIN PlaceDetail AS d ON p.PlaceId = d.PlaceId
     INNER JOIN City AS c ON p.CityId = c.Id
     INNER JOIN PlaceType AS t ON p.PlaceTypeId = t.Id
     INNER JOIN Country AS n ON p.CountryId = n.Id
     WHERE p.PlaceTypeId = 1) SR
    ON pd.PlaceId = SR.PlaceId;

 

Hope this helps.

Any queries and comments are appreciated and will be answered.

 

 

Monday, April 19, 2021

What is SQL? Explain DQL, DML, DDL, DCL and TCL statements with examples.

Learn about SQL and categories of SQL commands like DQL, DML, DDL, DCL and TCL statements with specific command examples.

SQL & DQL, DML, DDL, DCL and TCL statements
 

SQL: 

SQL stands for Structured Query Language. It is the language for relational databases, used to retrieve, store, and manipulate data.

RDBMSs like Oracle, SQL Server, MySQL, PostgreSQL, DB2, etc. use SQL as their database language. NoSQL databases generally do not.

SQL uses commands like SELECT, CREATE, INSERT, UPDATE, DROP, TRUNCATE, etc. to carry out specific tasks in the database.


These SQL commands are primarily categorized into the following categories:

  1. DDL – Data Definition Language
  2. DQL – Data Query Language
  3. DML – Data Manipulation Language
  4. DCL – Data Control Language
  5. TCL – Transaction Control Language

Data Definition Language (DDL)

Data Definition Language or DDL commands are the SQL commands used to define the database schema and objects like tables, indexes, procedures, triggers, etc. 

DDL commands are auto-committed, meaning that changes are saved permanently in the database. 

Examples of DDL commands:

  • CREATE
  • DROP
  • ALTER
  • TRUNCATE
  • COMMENT
  • RENAME
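For example, a minimal DDL sequence against a hypothetical Employees table (each statement shown for illustration):

CREATE TABLE Employees (Id int PRIMARY KEY, Name nvarchar(100));
ALTER TABLE Employees ADD HireDate date;
TRUNCATE TABLE Employees;
DROP TABLE Employees;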

Data Query Language (DQL) :

Data Query Language or DQL statements are used to retrieve data from database tables. SELECT also plays a part in inserting data, for example in INSERT ... SELECT or SELECT INTO.

Example of DQL commands:

  • SELECT
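For example, a simple query against a hypothetical Employees table:

SELECT Id, Name
FROM Employees
WHERE HireDate >= '2021-01-01';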

 

Data Manipulation Language(DML): 

DML SQL commands are used to manipulate data in the database. 

DML commands are not auto-committed, meaning that changes have to be committed explicitly and can be rolled back.

Examples of DML commands:

  • INSERT
  • UPDATE
  • DELETE
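For example, a few DML statements against a hypothetical Employees table:

INSERT INTO Employees (Id, Name) VALUES (1, 'John Doe');
UPDATE Employees SET Name = 'Jane Doe' WHERE Id = 1;
DELETE FROM Employees WHERE Id = 1;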

 

Data Control Language(DCL): 

DCL commands deal with the rights, permissions and other controls of the database system.

Examples of DCL commands:

  • GRANT
  • REVOKE
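For example, granting and revoking read access for a hypothetical database user:

GRANT SELECT ON Employees TO ReportingUser;
REVOKE SELECT ON Employees FROM ReportingUser;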


Transaction Control Language(TCL): 

TCL commands are used to manage transactions in the database; they control whether the changes made by DML statements are committed permanently or rolled back.

Examples of TCL commands:

  • COMMIT
  • ROLLBACK
  • SAVEPOINT
  • SET TRANSACTION
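For example, an explicit transaction with a savepoint (T-SQL syntax shown; other databases use SAVEPOINT and ROLLBACK TO SAVEPOINT):

BEGIN TRANSACTION;
UPDATE Employees SET Name = 'Jane Doe' WHERE Id = 1;
SAVE TRANSACTION BeforeDelete;
DELETE FROM Employees WHERE Id = 1;
ROLLBACK TRANSACTION BeforeDelete;  -- undo only the DELETE, back to the savepoint
COMMIT;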




Wednesday, April 7, 2021

What is the difference between primary key and unique key in SQL Server?

How to answer the DBMS interview question about the difference between a primary key and a unique key


This is a very commonly asked SQL Server and database interview question. Following is the answer:

1. By default, a primary key creates a clustered index on the table, whereas a unique key creates a non-clustered index.

2. A primary key column does not accept NULL values, whereas a unique key column (in SQL Server) accepts a single NULL value.

3. A table can have only one primary key. On the other hand, a table can have more than one unique key.

4. Duplicate values are not allowed in a primary key. A unique key also rejects duplicates, although the handling of NULLs in composite unique keys varies between database systems.

5. The purpose of a primary key is to enforce entity integrity and act as the target of foreign keys, whereas the purpose of a unique key is to enforce uniqueness of data within the table.
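A minimal T-SQL sketch showing both constraints side by side (the Users table and its columns are hypothetical):

CREATE TABLE Users
(
    Id    int IDENTITY PRIMARY KEY,   -- primary key: clustered index by default, no NULLs, one per table
    Email nvarchar(256) NULL UNIQUE   -- unique key: non-clustered index, at most one NULL in SQL Server
);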



Tuesday, July 21, 2020

How to solve ASP.NET MVC Exception: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. The statement has been terminated

ASP.NET MVC typically throws this error when a query runs too long. It can happen when:
1. You are running the query against a very large dataset
2. Your query is complex or inefficient
3. Database statistics and/or cached query plans are out of date
4. You are blocked or caught in a deadlock

You should be able to find the specific problem you are facing from the stack trace.

Solutions:

Step 1:
If your query needs more than the default 30 seconds, you might want to set the CommandTimeout higher. To do that, set it on the SelectCommand property of the SqlDataAdapter after you have instantiated it, like so:

        protected DataTable GetDataTable(string Query, CommandType cmdType, params SqlParameter[] parameters)
        {
            string strCon = DataContext.Database.Connection.ConnectionString;
            DataTable dt = new DataTable();
            using (SqlConnection con = new SqlConnection(strCon))
            {
                using (SqlCommand cmd = new SqlCommand(Query, con))
                {
                    cmd.CommandType = cmdType;
                    cmd.Parameters.AddRange(parameters);
                    using (SqlDataAdapter da = new SqlDataAdapter(cmd))
                    {
                        // Raise the command timeout from the default 30 seconds to 10 minutes
                        da.SelectCommand.CommandTimeout = 60 * 10;
                        da.Fill(dt);
                    }
                }
            }
            return dt;
        }


Step 2:
To reduce query complexity you need to rewrite the query; there is no single method that improves every query's performance. Indexing helps if you have large datasets, and rebuilding fragmented indexes can also help. For example, DBCC DBREINDEX rebuilds the indexes on a table:
DBCC DBREINDEX ('HumanResources.Employee', ' '); 
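On SQL Server 2005 and later, the same rebuild can also be written with ALTER INDEX, the recommended replacement for the deprecated DBCC DBREINDEX:

ALTER INDEX ALL ON HumanResources.Employee REBUILD;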


Step 3:

Update the query plan statistics for all tables and indexes with a single command:

exec sp_updatestats

If that doesn't help, you can also try clearing the cached query plans:

dbcc freeproccache

Hope this can be helpful.

Thursday, March 12, 2020

How to get random rows in SQL? SQL Query to get Random data. Top N Random Records.

There are a number of ways to get random rows from a database. Following are some examples that work without any extra algorithm:

SQL Server:

SELECT TOP 20 *
FROM table_name
ORDER BY NEWID()

MySQL:

SELECT *
FROM table_name
ORDER BY RAND()
LIMIT 20



PostgreSQL:

SELECT *
FROM table_name
ORDER BY RANDOM()
LIMIT 20

Oracle:

SELECT *  FROM
( SELECT *
FROM table_name
ORDER BY dbms_random.value )
WHERE rownum <= 20
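A caveat worth noting: ORDER BY NEWID()/RAND()/RANDOM() sorts the whole table, which gets slow as the table grows. If an approximate sample is acceptable, SQL Server's TABLESAMPLE avoids the full sort; a sketch (the sampled percentage is page-based, so the row count is only approximate):

SELECT *
FROM table_name TABLESAMPLE (1 PERCENT);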