Saturday, 2 November 2019

SQL Server - CTE and Recursive CTE

Common Table Expression (CTE): is a temporary named result set. It must be followed by single Select, Insert, Update, Delete statement that references some/all columns defined in the select statement. We can use CTE in CREATE VIEW statement as part of the SELECT Statement.

Order by, into, option clause with query hints cannot be used in CTE query definition.

Example:

with cte as
(
 select * from emp where deptno = 10
)

select * from cte;

Output:







Recursive Common Table Expression: is that references itself. Recursive CTE definition must contain at least two CTE query definitions, an anchor member and a recursive member. Anchor members must combine by one of SET operators (UNION ALL, UNION, INTERSECT or EXPECT). UNION ALL is the only set operator used between the last anchor member and the first recursive member. No of columns and data types in an anchor and recursive members must be the same. 

SELECT DISTINCT, GROUP BY, HAVING, TOP, LEFT, RIGHT, OUTER JOIN, Sub Queries cannot be used in CTE Query definition of a recursive member.

Query to list employees details and their level of reporting using recursive cte.

Example:

with cte as
(select emp.empno,emp.ename,emp.mgr,dept.dname ,0 lvl 
 from emp join dept on emp.deptno = dept.DEPTNO
 where mgr is null
 union all
 select emp.empno,emp.ename,emp.mgr,dept.dname,cte.lvl+1 
 from emp 
 join dept on emp.deptno = dept.deptno 
 join cte  on emp.mgr = cte.empno
)
select * from cte;

Output:














ExampleFollowing query is used to display numbers from 1 to 110 using recursive CTE.

with cte as
(
select 1 lvl 
union all
select c.lvl+1 
from cte c where c.lvl<110
)
select * from cte;





When we run the above query, we will get an error (The statement terminated. The maximum recursion 100 has been exhausted before statement completion). By default, the maximum number of recursion allowed in CTE is 100. To overcome this error we use hint Maxrecursion. This hint specifies the maximum number of recursions allowed by query and the number of recursions is a positive integer between 0 and 32,767. 

Example:

with cte as

(select 1 lvl 
union all
select c.lvl+1 
from cte c where c.lvl<110
  )
select * from cte option(maxrecursion 110);


Output:
There is a scenario to display numbers up to 40,000. When we use maxrecursion (32767)

Example:

with cte as
(
select 1 lvl 
union all
select c.lvl+1 
from cte c where c.lvl<40000
)
select * from cte 
option (maxrecursion 32767);

Output:



After running the above query, we are getting an error (The statement terminated. The maximum recursion 32767 has been exhausted before statement completion). Because the output result has been crossed the maximum recursive limit of 32767. To overcome the above error, we can use option (maxrecursion 0).

option (maxrecursion 0): When 0 specified no limit is applied for recursion.

Example:
with cte as
(
select 1 lvl 
union all
select c.lvl+1 
from cte c where c.lvl<40000
)
select * from cte option (maxrecursion 0);

Output:


















Now we can see the numbers up to 40,000.

Wednesday, 30 October 2019

Hive - Difference between order by and sort by

Order by: the actual ordering of data is performed in a single node.


hive> select * from emp order by ename;
can use asc/ascending for ordering.
data will be order by ename ASCENDING by default.









hive> select * from emp order by ename desc/descending;
data will be order by ename DESCENDING.









Sort by: if data is stored across more than one node, partial sorting is done node wise, but it can't be 100% perfect sorting. Can use ascending/asc , descending/desc for ascending and descending sorting.

Following will display the data sorted by ename in ascending order. 

hive>select * from emp sort by ename asc;



Monday, 28 October 2019

Hive - Database Commands

What is Hive?

Hive is a data warehousing infrastructure for Hadoop. That converts SQL queries to a MapReduce(MR) program, and this MR runs in the YARN (Yet Another Resource Negotiator)  framework. Supports analysis of large data sets stored in Hadoop Distributed File System (HDFS). It was built for Data Mining Applications, which takes to process data from several minutes to several hours. Which is also called Hive Query Language (HQL). 

The following are some commands that are used in HQL.

Show DatabaseList all installed databases in the system.

hive> Show Databases;



use <dbname>: To set current working DB name by default. 

hive>use siva;



we can see the change, instead hive (default), we can see the change hive (siva). Currently we are working using siva database.

Show Tables: List all tables in the selected database

hive>show tables;













To list all data from table emp in Siva database.

hive> select * from siva.emp;


Limit: Command is used to get 'n' number of records from the table.
To list 5 records from emp table.

hive>select * from emp limit 5;





DescribeDisplays a basic structure of a given table.

hive>describe emp;











Describe Extended: To get additional details of table.

hive>describe extended emp;









Show Functions: To list all system functions.


hive>show functions;




Describe Function: To get function syntax and basic information of function. To get function XPath information.

hive>describe function XPath;







Describe Function Extended: To get function syntax and detailed information of function. 

hive>describe function extended Xpath;




!Clear: To clear the screen.
hive>!clear;

Saturday, 2 March 2019

SQL Server - Generate Random number using User Defined Functions

Calling RAND function in a user defined Function:

RAND() function in SQL Server returns random number between 0 and 1.


What happens when we call RAND function in a Function?

Creating an user-defined function to generate Random value:

Create function dbo.Fn_GenRand(@a int)
returns float
as
begin
return(select rand())
end

Msg 443, Level 16, State 1, Procedure Fn_GenRand, Line 5
Invalid use of a side-effecting operator 'rand' within a function.

While creating above function, we get error message by using RAND() function. Because RAND() function is of Non-Deterministic function.

In SQL Server, USER defined functions are basically divided into 2 types. 

1. Deterministic Functions always returns same output when they called with specific set of input values in the same state of database.

SQRT,ISNULL,ISNUMERIC,POWER,DATEDIFF,CEILING,FLOOR,DATEADD,DAY,MONTH,YEAR,SQUARE etc.

2. Non-Deterministic Functions always returns different output when they called with specific set of input values in the same state of database.

GETDATE,ROW_NUMBER,RAND,@@MAX_CONNECTIONS,@@CONNECTIONS,RANK,NEWID, @@TOTAL_WRITE etc.

How to use get Random Value using Function?

1. Create a view:

Create view [dbo].[vw_genrandom]
as

SELECT RAND(convert(varbinary,NEWID()))*100  AS val


2. Create function:

Create function dbo.Fn_GenRand(@a int)
returns float
as
begin
return(select * from vw_genrandom)*@a

end


3. Generate Random Number:

select  dbo.Fn_GenRand(10)

Output







Sunday, 10 February 2019

SQL Server - DDL operation on Linked Server

How to perform DDL Operations on Linked Server in SQL Server?


Linked server in SQL Server is a server object. Linked server are configured to enable database engine to execute T-SQL statements that includes tables in another instance of SQL server or cross platform databases such as Oracle.

How to create Linked Server in SQL server?

Following is the code to create linked server from SQL Server to Oracle.

BEGIN
IF EXISTS (SELECT srv.name FROM sys.servers srv WHERE srv.server_id != 0 AND srv.name = 'LINKEDSERVER')
EXEC master.dbo.sp_dropserver @server='LINKEDSERVER', @droplogins='droplogins'
EXEC sp_addlinkedserver  @server='LINKEDSERVER',@srvproduct='Oracle',@provider='OraOLEDB.Oracle', @datasrc='<SourceServerName>'
EXEC sp_addlinkedsrvlogin 'LINKEDSERVER', 'false', NULL, '<Source db userId>', '<Source db Password>'
EXEC master.dbo.sp_MSset_oledb_prop N'OraOLEDB.Oracle', N'AllowInProcess', 1
EXEC master.dbo.sp_MSset_oledb_prop N'OraOLEDB.Oracle', N'DynamicParameters', 1
EXEC master.dbo.sp_MSset_oledb_prop N'OraOLEDB.Oracle', N'NestedQueries', 1
EXEC master.dbo.sp_MSset_oledb_prop N'OraOLEDB.Oracle', N'SqlServerLIKE', 1
EXEC master.dbo.sp_serveroption @server=N'LINKEDSERVER', @optname=N'use remote collation', @optvalue=N'true'
END

user has to provide Oracle Server Name, UserId and Password of oracle instance in above code mentioned in <>.

How to get records counts for a given table in source database?

Following sql server script creates Tab_RowCount function in Oracle database using lnked server.

BEGIN
EXEC master.dbo.sp_serveroption 
                                                        @server=N'LINKEDSERVER', 
                                                        @optname=N'rpc out', 
                                                        @optvalue=N'true'
exec ('
create or replace
function Tab_RowCount( tablename in varchar2(128) ) return number
as
ColumnValue number default NULL;
begin
execute immediate
''select count(*)
from '' || tablename INTO ColumnValue;
return ColumnValue;
end;
') AT LINKEDSERVER
EXEC master.dbo.sp_serveroption 
                                                         @server=N'LINKEDSERVER', 
                                                         @optname=N'rpc out', 
                                                         @optvalue=N'false'
END

Execute function to get all tables count from oracle using LINKEDSERVER:

Following SQL Server script get all tables record count from oracle.

Select * from OPENQUERY (LINKEDSERVER,
                                              ' Select Table_Name, Tab_RowCount(User||''.''||table_name) Cnt
                                                from user_tables ')

Drop user created function using LINKEDSERVER

BEGIN
EXEC master.dbo.sp_serveroption @server=N'LINKEDSERVER', @optname=N'rpc out', @optvalue=N'true'
exec ('
drop 
function Tab_RowCount
') AT LINKEDSERVER
EXEC master.dbo.sp_serveroption @server=N'LINKEDSERVER', @optname=N'rpc out', @optvalue=N'false'
END





Facebook