In Postgres, cluster means multiple databases, not multiple instances. One cluster is one instance.
Each database can have multiple schemas. Schemas are containers of objects. We use schemas to divide our objects into groups with logical meaning. We have the default public schema. Then we can have domain specific schemas wile HR etc.
You cannot do koins between tables in two separate databases, even though they are in same instance. Databases are really isolated from eachother. You only do a join through a database link.
Internally, in the host, each database is a directory.
For bulk import and exports, use: COPY
Query Execution plan. Command EXPLAIN allows us to observe the Plan Tree. EXPLAIN ANALYZE gets real data after a real execution.
Heap implements the sequential scan in postgresql.
*******POSTGRES**********
– ORDER BY is usually used at the end of a statement
– GROUP BY : used to divide rows returned from a SELECT statement into groups. You can also apply aggregate functions like SUM and COUNT
– HAVING clause is used to eliminate groups of rows that do not satisfy a specific condition. HAVING is used after GROUP BY.
– TRUNCATE : remove all data from the table
– DROP : delete the table and the data.
– SUBQUERY : query nested inside another query. Used in SELECT, UPDATE and INSERT statements. To construct a subquery, the second query is placed in brackets. You also use WHERE clause as expression in the subquery.You can also use the subquery with IN operator and HAVING operator
So subquery is a query statement where you have another query embedded in the WHERE or HAVING clauses
*************************
Windows Frame – it is a filtered portion of partition
units available for frame are rows and range.
*************************
***Within OVER() we are required to give an ORDER BY clause.
***In scenarios where we don’t care about the ordering, we can do like this: OVER (ORDER BY (SELECT NULL))
***See Example
SELECT Sales_Customer_Id , SUM(Sales_Amount) AS Cust_Total , SUM(SUM(Sales_Amount)) OVER (ORDER BY (SELECT NULL)) AS Grand_Total FROM dbo.Sales GROUP BY Sales_Customer_Id
*************************
Window Functions can only be included within SELECT or ORDER BY clauses.
Functions Available:
Aggregate – COUNT, SUM, MIN, MAX, AVG
Ranking – ROW_NUMBER, RANK, DENSE_RANK, NTILE
Offset – FIRST_VALUE, LAST_VALUE, LEAD, LAG
Statistical – PERCENT_RANK, CUME_DIST, PERCENTILE_CONT, PERCENTILE_DIST
Windows Functions also have FRAMES
ROWS
RANGE
*************************
UPDATE salary SET sex = CASE sex WHEN 'm' THEN 'f' ELSE 'm' END;
************************
WITH CTE AS ( SELECT Sales_Id , SUM(Line_Total) AS Total FROM Sales_Details GROUP BY Sales_Id ) SELECT * FROM CTE AS A INNER JOIN Sales_Details AS B ON A.Sales_Id = B.Sales_Id
************************
SELECT Sales_Id , Sales_Date , Item , Price , Quantity , Line_Total , COUNT(Line_Total) OVER(PARTITION BY Sales_Id) AS Line_Count , SUM(Line_Total) OVER(PARTITION BY Sales_Id) AS Sales_Total , SUM(Line_Total) OVER(PARTITION BY Sales_Date) AS Daily_Total , SUM(Line_Total) OVER() AS Total FROM Sales_Details ORDER BY Sales_Total
*************************
*** window funcs with group by ***
SELECT Sales_Cust_Id , SUM(Sales_Total) AS Total , RANK() OVER(ORDER BY SUM(Sales_Total) DESC) AS rnk , DENSE_RANK() OVER(ORDER BY SUM(Sales_Total) DESC) AS dnse FROM dbo.Sales_2 GROUP BY Sales_Cust_Id ORDER BY rnk
*************************
*** Window funs – Calculate running totals/ averages ***
*** Scenario: We want to calculate the running total and average of customers spend ***
SELECT Sales_Id , Sales_Date , Sales_Total , Sales_Cust_Id , SUM(Sales_Total) OVER(PARTITION BY Sales_Cust_Id ORDER BY Sales_Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT_ROW) AS [Running Total] FROM dbo.Sales_2 ORDER BY Sales_Date
**************************
***Lag and Lead are useful for performing trend analysis, in the example I show how we can display a customer spending trend.
***Lag will show the previous value.
***Lead will show the next value.
***Lag and Lead accept multiple parameters as demonstrated in the video:
***LAG([Column], [Offset], [Value if NULL])
***The example of LAG and LEAD in the video can be shown by executing the below SQL query:
SELECT Sales_Customer_Id , Sales_Date , LAG(Sales_Amount, 2, 0) OVER(PARTITION BY Sales_Customer_Id ORDER BY Sales_Date) AS PrevValue , Sales_Amount , LEAD(Sales_Amount, 2, 0) OVER(PARTITION BY Sales_Customer_Id ORDER BY Sales_Date) AS NextValue FROM dbo.Sales
**************************
***Window Functions – Calculate Running Differences
WITH CTE AS ( SELECT Sales_Customer_Id , Sales_Date , Sales_Amount , LAG(Sales_Amount) OVER(PARTITION BY Sales_Customer_Id ORDER BY Sales_Date) AS PrevValue , Sales_Amount - LAG(Sales_Amount) OVER(PARTITION BY Sales_Customer_Id ORDER BY Sales_Date) AS RunningDifference FROM dbo.Sales ) SELECT Sales_Customer_Id , AVG(RunningDifference) AS AverageDifference FROM CTE GROUP BY Sales_Customer_Id ORDER BY AverageDifference DESC;
******************************
***Show Totals with GROUP BY using OVER
SELECT Sales_Customer_Id , SUM(Sales_Amount) AS Cust_Total , SUM(SUM(Sales_Amount)) OVER(ORDER BY (SELECT NULL)) AS Grand_Total , AVG(SUM(Sales_Amount)) OVER(ORDER BY (SELECT NULL)) AS Average_Cust_Total , CAST((SUM(Sales_Amount) / SUM(SUM(Sales_Amount)) OVER(ORDER BY (SELECT NULL))) * 100 AS DECIMAL(6,2)) AS Pct FROM dbo.Sales GROUP BY Sales_Customer_Id