Friday, December 10, 2021

How to split a comma separated string into multiple records using T-SQL function?

 Let's say you have a comma separated string like: 

split,a,comma,separated,string

 

Now you need to convert the string to follow rows:

split
a
comma
separated
string

 

Solution:

You can create a user defined UDF like the one shown below. Then, just pass in the comma separated list from another query and it will return a table with each value in a separate row. The function used here is a table valued function.

 

CREATE FUNCTION [dbo].[fnSplitStringToTable]
(
    @input nvarchar(MAX),
    @delimiter char(1) = ','
)
RETURNS
@Result TABLE
(
    Value nvarchar(MAX)
)
AS
BEGIN
    DECLARE @chIndex int
    DECLARE @item nvarchar(MAX)

    WHILE CHARINDEX(@delimiter, @input, 0) <> 0
    BEGIN
        SET @chIndex = CHARINDEX(@delimiter, @input, 0)
        SELECT @item = SUBSTRING(@input, 1, @chIndex - 1)

        IF LEN(@item) > 0
        BEGIN
            INSERT INTO @Result(Value)
            VALUES (@item)
        END
        SELECT @input = SUBSTRING(@input, @chIndex + 1, LEN(@input))
    END
    IF LEN(@input) > 0
    BEGIN
        INSERT INTO @Result(Value)
        VALUES (@input)
    END
    RETURN
END

 

Here is how you can execute the function and produce results as expected:

 

 


Tuesday, November 30, 2021

What are the functions of Data Mining?

Data mining is defined as the process of discovering patterns in data. The process must be automatic or (more usually) semiautomatic. The patterns discovered must be meaningful in that they lead some advantage, usually an economic one.


Data mining have two major functions: 

1. Classification:

  • Classification maps data into predefined groups or classes. 
  • It is often referred to as supervised learning because the classes are determined before examining the data. 
  • Classification creates a function from training data. The training data consist of pairs of input objects, and desired output. The output of the function can be a continuous value, or can predict a class label of the input object. 
  • The task of the classification is to predict the value of the function for any valid input object after having seen only a small number of training examples.

 2. Clustering:

  • Clustering is similar to classification except that the groups are not predefined, but rather defined by the data alone. 
  • Clustering is alternatively referred to as unsupervised learning or segmentation. 
  • It can be thought of as partitioning or segmenting the data into groups that might or might not be disjointed. 
  • The clustering is usually accomplished by determining the similarity among the data on predefined attributes. The most similar data are grouped into clusters.




What is Machine Learning? What are different types of Machine Learning?

Machine Learning:

Machine learning provides the technical basis of data mining. It is a branch of artificial intelligence, which concerns the construction and study of systems that can learn from data.

For example, a machine learning system could be trained on email messages to learn to distinguish between spam and non-spam messages. After learning, it can then be used to classify new email messages into spam and non-spam folders.


Types of Machine Learning:

 
Supervised learning is basically a synonym of classification. The supervision in the learning comes from the labeled instances in the training data. 

Unsupervised learning is essentially a synonym of clustering. The learning process is unsupervised since the input instances are not class labeled. 

Semi-supervised learning is a class of machine learning technology that make use of both labeled and unlabelled instances when learning a model. 

Active learning is a machine learning approach that lets users play an active role in the learning process. An active learning approach can ask a user (e.g., a domain expert) to label an instance, which may be from a set of unlabelled instances.

What is Data Mining? What Kinds of Data can be Mined?

Data mining is also known as Knowledge Discovery from Data, or KDD for short, which turns a large collection of data into knowledge. Data mining is a multidisciplinary field including machine learning, artificial intelligence, pattern recognition, knowledge-based systems, high-performance computing, database technology, and data visualization. 

  • Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. 
  • Data mining is the process of finding hidden information and patterns in a huge database. 
  • Data mining is the extraction of implicit, previously unknown, and potentially useful information from data.

 

Data, Information, and Knowledge
Data: Data are any recorded facts, numbers, or text that can be processed by a computer - scientific data, medical data, demographic data, financial data, and marking data.

Information: The patterns, associations, or relationships among all this data can provide information.

Knowledge: Information can be converted into knowledge about historical patterns and future trends.

 

What Kinds of Data can be Mined?
The most basic forms of data for mining are come from: 

  1. Database Data 
  2. Data Warehouses 
  3. Transactional Data


What is method overloading?

Method overloading is a polymorphism technique that allows us to have same method in different ways. Overloaded method can have different number of parameters, different parameter types, and different arrangement of parameters but with same name.

What is an interface? When to use interface over abstract class?


An interface contains definitions for a group of related functionalities that a class or a struct can implement.

Use interface over abstract class:

a. If the functionality you are creating will be useful across a wide range of disparate objects, use an interface.

b. If you are designing small, concise bits of functionality, use interfaces. If you are designing large functional units, use an abstract class.

Thursday, September 2, 2021

SQL update from one Table to another based on a ID match

SQL update from one Table to another based on a ID match

While working with SQL database you might often require to update one table data based on values of a column value in another table. For this to be done successfully you might require a common id field or foreign key which will match values of both tables to update.

Here is the syntax to perform the task in SQL Server:

UPDATE table1

SET t1.Col = t2.Col

FROM table1 t1

INNER JOIN table2  t2 ON t1.ID = t2.ID;

 

To demonstrated this let us assume we have following two tables: 


Place
===========
PlaceId                 int,
name                     nvarchar(200),
ispublished           bit


PlaceDetails
==============
PlaceDetailId          int,
PlaceId                     int,
DetailsDesc             nvarchar(max),
SeoTitle                   nvarchar(max),
SeoDescription         nvarchar(max)
,
status                       bit


Case 1: Update status field of PlaceDetails from values of ispublished in Place table.

Solution:

UPDATE PlaceDetails

SET d.status = p.ispublished

FROM PlaceDetails d

          INNER JOIN Place p ON d.PlaceId = p.PlaceId;


Case 2: Update status field of PlaceDetails from values of ispublished in Place table only for the places that are published.

Solution: You require to add a where condition along with previous query for this purpose.

UPDATE PlaceDetails

SET d.status = p.ispublished

FROM PlaceDetails d

          INNER JOIN Place p ON d.PlaceId = p.PlaceId 

         WHERE p.ispublished='True';


Case 3: Update SeoTitle and SeoDescription field of PlaceDetails for a specific placetypes

Solution: This is a complex implementation of the query that allows you to update a table values based on a complex query result. For this purpose we consider some more tables. 

The following query is used to generate SEO Titile and SEO Description information for all of the places of a category:

SELECT p.PlaceId, p.name, 'Book a room in ' + p.name +', '+ c.Name+' | addressschool.com',
p.name +' is a '+ t.Name+' in '+c.Name+', '+n.Name +' that is located in '+d.formatted_address
FROM     Place AS p INNER JOIN
                  PlaceDetail AS d ON p.PlaceId = d.PlaceId INNER JOIN
                  City AS c ON p.CityId = c.Id inner join
                  PlaceType t on p.PlaceTypeId = t.Id INNER JOIN
                  Country AS n ON p.CountryId = n.Id
WHERE  (p.PlaceTypeId = 1)

 

Now we use the above query as a inline view in the update statement to update SeoTitle and SeoDescription information in PlaceDetails table:

 UPDATE
    PlaceDetail
SET
    PlaceDetail.name = SR.SeoTitle,
    PlaceDetail.url = SR.SeoDescription
FROM
    PlaceDetail pd
INNER JOIN
    (SELECT p.PlaceId, p.name, 'Book a room in ' + p.name +', '+ c.Name+' | addressschool.com' as SeoTitle,
    p.name +' is a '+ t.Name+' in '+c.Name+', '+n.Name +' that is located in '+d.formatted_address as SeoDescription
    FROM     Place AS p INNER JOIN
                      PlaceDetail AS d ON p.PlaceId = d.PlaceId INNER JOIN
                      City AS c ON p.CityId = c.Id inner join
                      PlaceType t on p.PlaceTypeId = t.Id INNER JOIN
                      Country AS n ON p.CountryId = n.Id
    WHERE  (p.PlaceTypeId = 1)) SR
ON
    pd.PlaceId = SR.PlaceId;

 

 Hope this will help you.

Any query and comments will be appreciated and answered.

 

 

Monday, April 19, 2021

What is SQL? Explain DQL, DML, DDL, DCL and TCL statements with examples.

Learn about SQL and categories of SQL commands like DQL, DML, DDL, DCL and TCL statements with specific command examples.

SQL & DQL, DML, DDL, DCL and TCL statements
 

SQL: 

SQL stands for Structured Query Language. It is the language for database that helps retrieving, storing, and manipulation of data in a relational database.

All RDBMS like Oracle, SQL Server, MySQL, PostgresSQL, DB2 etc uses SQL as database language. NoSQL database don't use it.

SQL uses certain commands like Select, Create, Insert, Update, Drop, Truncate etc. to carry out the specific tasks in the database.


These SQL commands are primarily categorized into following categories:

  1. DDL – Data Definition Language
  2. DQL – Data Query Language
  3. DML – Data Manipulation Language
  4. DCL – Data Control Language
  5. TCL – Transaction Control Language

Data Definition Language (DDL)

Data Definition Language or DDL commands are comprised of the SQL commands that are used to define the database schema and objects like tables, index, procedures, triggers etc. 

DDL commands are auto-committed, meaning that changes saved permanently in the database. 

Examples of DDL commands:

  • CREATE, 
  • DROP, 
  • ALTER, 
  • TRUNCATE, 
  • COMMENT,
  • RENAME 

Data Query Language (DQL) :

Data Query Language or DQL statements are used to retrieve data from database tables. It is also in some form works for inserting data.

Example of DQL commands:

  • SELECT

 

Data Manipulation Language(DML): 

DML SQL commands are used to manipulate data in the database. 

DML commands are not auto-committed, meaning that changes have to committed explicitly and it can be roolback

Examples of DML commands:

  • INSERT
  • UPDATE
  • DELETE

 

Data Control Language(DCL): 

DCL commands deal with the rights, permissions and other controls of the database system.

Examples of DCL commands:

  • GRANT
  • REVOKE


Transaction Control Language(TCL): 

TCL commands are used to manage the transaction at the database level. These operations are automatically committed in the database.

Examples of TCL commands:

  • COMMIT
  • ROLLBACK
  • SAVEPOINT
  • SET TRANSACTION




Wednesday, April 7, 2021

What is the difference between primary key and unique key in SQL Server?

How to answer DBMS interview question about the difference between primary key and unique key

difference between primary key and unique key

This is very commonly asked interview question from SQL Server or database. Following is the answer of this question:

1. By default, Primary key creates clustered index in the table but unique key creates non-clustered index.

2. Primary key column does not accept any null values, where as a unique key column accept only one null value.

3. A table can have only one primary key. On the other hand a table can have more than one unique key.

4. Duplicate values are not allowed in primary key where as duplicate value will be accepted if one or more key parts are null

5. The purpose of implementing primary key is to enforce integrity between entities of database, on the other hand the purpose of unique key is to enforce unique data within the table.



Saturday, February 20, 2021

Microsoft Power BI interview questions and answers

Learn the most frequently asked questions on Microsoft Power BI in the job interviews in 2021 with answers

 Microsoft Power BI job interviews questions with answers

 

What feature is not available in the free version of Power BI?

  1. Publish content to the Power BI mobile app
  2. Connecting to more than one data source
  3. The Power Query Editor
  4. Creating reports with more than two visualizations

Ans: 1

 

What is the three-step process of data modeling in Power BI?

First, we ________ from a data source. Then, we ________ the data to how we need it, and last, we _________ the data into our Power BI data model.

  1. export, load, transform
  2. load, transform, publish
  3. import, transform, load
  4. load, transform, extract

 Ans: 3

How is Power Query like the human circulatory system? Check all that apply.

  1. consumes data from external data sources
  2. connects automatically to CSV files
  3. loads enriched data into data model
  4. generates new data with synced files
  5. creates data visualizations automatically
  6. transforms and enriches data

Ans: 1,3,6

You can manipulate datasets using Python or R in Power Query.

  1. True
  2. False

 Ans: 1


Once you make a change to your dataset in PowerQuery, you do not have the ability to view how the dataset looked prior to that transformation.

  1. True
  2. False

 Ans: 2


Which of the following terms describes a connection to a data source for the purpose of gathering information?

  1. CSV file
  2. Promoted Headers
  3. Query
  4. Query Settings
  5. None of these

  Ans: 3

 Select each of the following functions which can you perform in Power Query. Check all that apply.

  1. Remove columns
  2. Build visualizations
  3. Rename columns
  4. Perform calculations
  5. View all transformations

  Ans: 1,3,4,5

 We must click "Close & Apply" to leave Power Query and return to Power BI Desktop

  1. True
  2. False

 Ans: 1

Power BI visualizations are only interactive once we enable the feature in Power BI Desktop.

  1. True
  2. False

 Ans: 2, Most visualizations are interactive by default in Power BI, and we do not have to take any extra steps for it.

To create a new visualization, we must click out of an already selected chart before selecting a new chart.

  1. True
  2. False
 Ans: 1, If we keep a chart selected and attempt to add a new chart, it will change the selected chart into the new chart we have clicked. To avoid this, it is very important to learn to first UN-SELECT the chart we are working in before adding a new chart to the report canvas.


Which of the following is not a correct way to add a field to a visualization?

  1. Drag and drop from fields pane onto the visualizations pane
  2. Click the field's checkbox in the fields pane
  3. Double-click the field in the fields pane
  4. Drag and drop from fields pane onto the visualization in the report body

 Ans: 3 


What are three types of filters?
  1. Visual-level filters
  2. Table-level filters
  3. Page-level filters
  4. Report-level filters
  5. Column-level filters

 Ans: 1, 3, 4


__________ is a special kind of filter that allows us to filter a visual to just the top or bottom-ranked values based on a particular measure.
  1. Visual-level filter
  2. Top N filter
  3. Filters pane
  4. Basic filtering
  5. Report-level filter

 Ans: 2

 

In Power BI, the _______ icon locks a filter, so no one can change it when they run the report, and the _______ icon hides the filter completely, so no one can see it in the Filters pane when they run the report.

  1. key, glasses
  2. padlock, eyeball
  3. door, globe

 Ans: 2

What is the difference between a slicer and a filter?
  1. Slicers have three scopes: visual-level, page-level, and report-level
  2. A slicer puts a filter directly on your report page without using the filters pane
  3. Filters can only function as on-page filter controls
  4. None of the above
  5. All of the above

Ans: 2


 True or False: You can configure slicers so they only affect particular visuals on a page.

  1. True
  2. False

Ans: 1

 

What option allows us to view a visual's data in an Excel or CSV file?
  1. More Options
  2. Close & Apply
  3. Export Data
  4. Back to report
  5. Show Data

Ans: 3



The Merge Columns option gives you several methods for taking an existing column of data and breaking it out into two or more columns.

  1. True
  2. False
Ans: 2


Which of the following is not a use for the Column Profiler?

  1. You want to see which value occurs most in the column.
  2. You want to substitute new data in the column.
  3. You want to see how many empty cells exist in the column.
  4. You want to see if there are any errors in the column.

Ans: 2


Which join type would we use in a Merge Queries transformation if we wanted to keep only those records that exist in both the left and right table?

  1. Left outer
  2. Inner join
  3. Left Anti
  4. Right Anti


Ans: 2

True or False: Data profiling is the process of examining our data and summarizing information about it.

  1. True
  2. False

Ans: 1


Which of the following is NOT a correct reason to create an Index Column?

  1. To create a key that can be used to join two tables together
  2. To act as unique identifier for each record in our table
  3. To establish a date hierarchy

Ans: 3

The __________________ option in Power BI by default breaks down a column of dates into its individual components, making it easy to aggregate values at different periods.

  1. split date
  2. date hierarchy
  3. format date
  4. date table

Ans: 2

 

How did we get a count of the number of unique items in our data model?

  1. Use the DIVIDE function in DAX with the ShotID column
  2. Use the Dropdown feature in our slicer
  3. Use the DISTINCTCOUNT function in DAX with the ShotID column
  4. Create an index column with unique identifiers

Ans: 3


Which of the following options represent good uses for a scatter chart? (may be more than one)
  1. Plotting two measures to see how they might relate to each other
  2. Creating a geographic map of the data
  3. Ranking countries by a single measure or variable
  4. Plotting spatial (x,y) coordinates

Ans: 1,4