simplify learn data

Tuesday, December 27, 2016

Big Data

Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.

While the term “big data” is relatively new, the act of gathering and storing large amounts of information for eventual analysis is ages old. The concept gained momentum in the early 2000s when industry analyst Doug Laney articulated the now-mainstream definition of big data as the three Vs:
Volume. Organizations collect data from a variety of sources, including business transactions, social media and information from sensor or machine-to-machine data. In the past, storing it would’ve been a problem – but new technologies (such as Hadoop) have eased the burden.
Velocity. Data streams in at an unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time.
Variety. Data comes in all types of formats – from structured, numeric data in traditional databases to unstructured text documents, email, video, audio, stock ticker data and financial transactions.
At SAS, we consider two additional dimensions when it comes to big data:
Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage. Even more so with unstructured data.
Complexity. Today's data comes from multiple sources, which makes it difficult to link, match, cleanse and transform data across systems. However, it’s necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control.

Why Is Big Data Important?

The importance of big data doesn’t revolve around how much data you have, but what you do with it. You can take data from any source and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smart decision making. When you combine big data with high-powered analytics, you can accomplish business-related tasks such as:

Determining root causes of failures, issues and defects in near-real time.
Generating coupons at the point of sale based on the customer’s buying habits.
Recalculating entire risk portfolios in minutes.
Detecting fraudulent behavior before it affects your organization.

Monday, December 19, 2016

SQL SELECT

The SQL SELECT statement returns a result set of records from one or more tables.

A SELECT statement retrieves zero or more rows from one or more database tables or database views. In most applications, SELECT is the most commonly used data manipulation language (DML) command. As SQL is a declarative programming language, SELECT queries specify a result set, but do not specify how to calculate it. The database translates the query into a "query plan", which may vary between executions, database versions and database software. The component that does this is called the "query optimizer", as it is responsible for finding the best possible execution plan for the query, within applicable constraints.
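As a rough illustration of the query plan idea, the sketch below asks SQLite for its plan before and after an index is added. It assumes SQLite (through Python's standard sqlite3 module) as the database, and EXPLAIN QUERY PLAN is a SQLite-specific command. The table T with columns C1 and C2 follows the examples in this post; the index name idx_t_c1 is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [(1, "a"), (2, "b")])

query = "SELECT * FROM T WHERE C1 = 1"

# EXPLAIN QUERY PLAN is SQLite-specific; the wording of its output
# differs between SQLite versions, so it is only printed here.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Adding an index (hypothetical name) gives the optimizer another way
# to answer the same, unchanged query.
conn.execute("CREATE INDEX idx_t_c1 ON T (C1)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

rows = conn.execute(query).fetchall()
print(plan_before)
print(plan_after)
print(rows)
```

The query text is identical in both calls; only the plan the database reports may differ, which is the point of a declarative language.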

The SELECT statement has many optional clauses:

WHERE specifies which rows to retrieve.

GROUP BY groups rows sharing a property so that an aggregate function can be applied to each group.

HAVING selects among the groups defined by the GROUP BY clause.

ORDER BY specifies an order in which to return the rows.

AS provides an alias which can be used to temporarily rename tables or columns.
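The optional clauses listed above can be combined in one query. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module and using the sp table that appears later in this post; the column types and the alias total_qty are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

query = """
    SELECT sno, SUM(qty) AS total_qty   -- AS renames the aggregate column
    FROM sp
    WHERE qty IS NOT NULL               -- WHERE filters rows before grouping
    GROUP BY sno                        -- one group per supplier number
    HAVING SUM(qty) >= 200              -- HAVING filters whole groups
    ORDER BY total_qty DESC             -- ORDER BY sorts the final rows
"""
rows = conn.execute(query).fetchall()
print(rows)
```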

Examples

Each example below runs against the same table T:

C1	C2
1	a
2	b

Query: SELECT * FROM T;

Result:

C1	C2
1	a
2	b

Query: SELECT C1 FROM T;

Result:

C1
1
2

Query: SELECT * FROM T WHERE C1 = 1;

Result:

C1	C2
1	a

Query: SELECT * FROM T ORDER BY C1 DESC;

Result:

C1	C2
2	b
1	a

Given a table T, the query SELECT * FROM T will result in all the elements of all the rows of the table being shown.

With the same table, the query SELECT C1 FROM T will result in the elements from the column C1 of all the rows of the table being shown. This is similar to a projection in relational algebra, except that in the general case, the result may contain duplicate rows. This is also known as a vertical partition in some database terms, restricting query output to view only specified fields or columns.

With the same table, the query SELECT * FROM T WHERE C1 = 1 will result in all the elements of all the rows where the value of column C1 is 1 being shown. In relational algebra terms, a selection is performed because of the WHERE clause. This is also known as a horizontal partition, restricting rows output by a query according to specified conditions.

With more than one table, the result set will be every combination of rows. So if the two tables are T1 and T2, SELECT * FROM T1, T2 will result in every combination of T1 rows with T2 rows. E.g., if T1 has 3 rows and T2 has 5 rows, then 15 rows will result.
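The 3 x 5 = 15 arithmetic above can be checked with a small sketch. It assumes SQLite through Python's standard sqlite3 module; the single-column layout of T1 and T2 (columns a and b) is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (a INTEGER)")
conn.execute("CREATE TABLE T2 (b INTEGER)")
conn.executemany("INSERT INTO T1 VALUES (?)", [(i,) for i in range(3)])  # 3 rows
conn.executemany("INSERT INTO T2 VALUES (?)", [(i,) for i in range(5)])  # 5 rows

# No WHERE clause: every T1 row is paired with every T2 row.
rows = conn.execute("SELECT * FROM T1, T2").fetchall()
print(len(rows))
```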

The SELECT clause specifies a list of properties (columns) by name, or the wildcard character (“*”) to mean “all properties”. Note the special case of joinpropname: this provides for joins, but only on the jcr:path column, as described in 8.5.2 Database View. See also 6.6.3.1 Column Specifier.

SELECT Clause -- specifies the table columns retrieved
FROM Clause -- specifies the tables to be accessed
WHERE Clause -- specifies which rows in the FROM tables to use

SELECT Clause

The SELECT clause is mandatory. It specifies a list of columns to be retrieved from the tables in the FROM clause. It has the following general format:

SELECT [ALL|DISTINCT] select-list

select-list is a list of column names separated by commas. The ALL and DISTINCT specifiers are optional. DISTINCT specifies that duplicate rows are discarded. A duplicate row is when each corresponding select-list column has the same value. The default is ALL, which retains duplicate rows.

For example,

SELECT descr, color FROM p
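The difference between ALL and DISTINCT described above can be sketched as follows. This assumes SQLite through Python's standard sqlite3 module and borrows the sp table shown later in this post; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# ALL is the default, so duplicate part numbers are retained.
all_rows = conn.execute("SELECT ALL pno FROM sp").fetchall()

# DISTINCT discards rows whose select-list values repeat.
distinct_rows = conn.execute("SELECT DISTINCT pno FROM sp ORDER BY pno").fetchall()

print(all_rows)
print(distinct_rows)
```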

The column names in the select list can be qualified by the appropriate table name:

SELECT p.descr, p.color FROM p

A column in the select list can be renamed by following the column name with the new name. For example:

SELECT name supplier, city location FROM s

This produces:

supplier	location
Pierre	Paris
John	London
Mario	Rome

The select list may also contain expressions.

A special select list consisting of a single '*' requests all columns in all tables in the FROM clause. For example,

SELECT * FROM sp

sno	pno	qty
S1	P1	NULL
S2	P1	200
S3	P1	1000
S3	P2	200

The * delimiter will retrieve just the columns of a single table when qualified by the table name. For example:

SELECT sp.* FROM sp

This produces the same result as the previous example.

An unqualified * cannot be combined with other elements in the select list; it must stand alone. However, a qualified * can be combined with other elements. For example,

SELECT sp.*, city
FROM sp, s
WHERE sp.sno=s.sno

sno	pno	qty	city
S1	P1	NULL	Paris
S2	P1	200	London
S3	P1	1000	Rome
S3	P2	200	Rome

Note: this is an example of a query joining 2 tables.

FROM Clause

The FROM clause always follows the SELECT clause. It lists the tables accessed by the query. For example,

SELECT * FROM s

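When the FROM list contains multiple tables, commas separate the table names. For example,

SELECT sp.*, city
FROM sp, s
WHERE sp.sno=s.sno

When the FROM list has multiple tables, they should be joined together with a condition in the WHERE clause; otherwise the result contains every combination of rows from the tables.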

Correlation Names

Like columns in the select list, tables in the from list can be renamed by following the table name with the new name. For example,

SELECT supplier.name FROM s supplier

The new name is known as the correlation (or range) name for the table. Self joins require correlation names.
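To show why self joins need correlation names, here is a minimal sketch that joins the sp table to itself to find pairs of suppliers that supply the same part. It assumes SQLite through Python's standard sqlite3 module; the correlation names a and b and the column types are chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# The same table appears twice, so each copy needs its own correlation name.
# a.sno < b.sno keeps one row per pair and drops self-pairs.
query = """
    SELECT a.sno, b.sno
    FROM sp a, sp b
    WHERE a.pno = b.pno AND a.sno < b.sno
    ORDER BY a.sno, b.sno
"""
pairs = conn.execute(query).fetchall()
print(pairs)
```

Without the names a and b there would be no way to say which copy of sp a column such as sno refers to.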

WHERE Clause

The WHERE clause is optional. When specified, it always follows the FROM clause. The WHERE clause filters rows from the FROM clause tables. Omitting the WHERE clause specifies that all rows are used.

Following the WHERE keyword is a logical expression, also known as a predicate.

The predicate evaluates to a SQL logical value -- true, false or unknown. The most basic predicate is a comparison:

color = 'Red'

This predicate returns:

true -- if the color column contains the string value -- 'Red',
false -- if the color column contains another string value (not 'Red'), or
unknown -- if the color column contains null.

Generally, a comparison expression compares the contents of a table column to a literal, as above. A comparison expression may also compare two columns to each other. Table joins use this type of comparison.

The = (equals) comparison operator compares two values for equality. Additional comparison operators are:

> -- greater than
< -- less than
>= -- greater than or equal to
<= -- less than or equal to
<> -- not equal to

For example,

SELECT * FROM sp WHERE qty >= 200

sno	pno	qty
S2	P1	200
S3	P1	1000
S3	P2	200

Note: In the sp table, the qty column for one of the rows contains null. The comparison qty >= 200 evaluates to unknown for this row. In the final result of a query, rows for which the WHERE clause evaluates to unknown (or false) are eliminated (filtered out).

Both operands of a comparison should be the same data type; however, automatic conversions are performed between numeric, datetime and interval types. The CAST expression provides explicit type conversions.

Extended Comparisons

In addition to the basic comparisons described above, SQL supports extended comparison operators -- BETWEEN, IN, LIKE and IS NULL.

BETWEEN Operator

The BETWEEN operator implements a range comparison, that is, it tests whether a value is between two other values. BETWEEN comparisons have the following format:

value-1 [NOT] BETWEEN value-2 AND value-3

This comparison tests if value-1 is greater than or equal to value-2 and less than or equal to value-3. It is equivalent to the following predicate:

value-1 >= value-2 AND value-1 <= value-3

Or, if NOT is included:

NOT (value-1 >= value-2 AND value-1 <= value-3)

For example,

SELECT *
FROM sp
WHERE qty BETWEEN 50 AND 500

sno	pno	qty
S2	P1	200
S3	P2	200
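The equivalence between BETWEEN and the pair of comparisons can be checked with a short sketch, assuming SQLite through Python's standard sqlite3 module and the sp table from this post (column types are assumptions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

between_rows = conn.execute(
    "SELECT * FROM sp WHERE qty BETWEEN 50 AND 500 ORDER BY sno, pno"
).fetchall()

# The expanded form described in the text.
expanded_rows = conn.execute(
    "SELECT * FROM sp WHERE qty >= 50 AND qty <= 500 ORDER BY sno, pno"
).fetchall()

# NOT BETWEEN: the row with a null qty is still excluded, because the
# comparison evaluates to unknown for it either way.
not_between_rows = conn.execute(
    "SELECT * FROM sp WHERE qty NOT BETWEEN 50 AND 500 ORDER BY sno, pno"
).fetchall()

print(between_rows)
print(not_between_rows)
```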

IN Operator

The IN operator implements comparison to a list of values, that is, it tests whether a value matches any value in a list of values. IN comparisons have the following general format:

value-1 [NOT] IN ( value-2 [, value-3] ... )

This comparison tests if value-1 matches value-2 or matches value-3, and so on. It is equivalent to the following logical predicate:

value-1 = value-2 [ OR value-1 = value-3 ] ...

or if NOT is included:

NOT (value-1 = value-2 [ OR value-1 = value-3 ] ...)

For example,

SELECT name FROM s WHERE city IN ('Rome','Paris')

name
Pierre
Mario
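The IN operator and its expanded OR form can be compared in the same way. This is a sketch assuming SQLite through Python's standard sqlite3 module and the s table from this post (column types are assumptions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno TEXT, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?)",
    [("S1", "Pierre", "Paris"), ("S2", "John", "London"), ("S3", "Mario", "Rome")],
)

in_rows = conn.execute(
    "SELECT name FROM s WHERE city IN ('Rome', 'Paris') ORDER BY name"
).fetchall()

# The equivalent chain of equality tests joined by OR.
or_rows = conn.execute(
    "SELECT name FROM s WHERE city = 'Rome' OR city = 'Paris' ORDER BY name"
).fetchall()

not_in_rows = conn.execute(
    "SELECT name FROM s WHERE city NOT IN ('Rome', 'Paris') ORDER BY name"
).fetchall()

print(in_rows)
print(not_in_rows)
```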

LIKE Operator

The LIKE operator implements a pattern match comparison, that is, it matches a string value against a pattern string containing wild-card characters.

The wild-card characters for LIKE are percent -- '%' and underscore -- '_'. Underscore matches any single character. Percent matches zero or more characters.

Examples:

Match Value	Pattern	Result
'abc'	'_b_'	True
'ab'	'_b_'	False
'abc'	'%b%'	True
'ab'	'%b%'	True
'abc'	'a_'	False
'ab'	'a_'	True
'abc'	'a%_'	True
'ab'	'a%_'	True

LIKE comparison has the following general format:

value-1 [NOT] LIKE value-2 [ESCAPE value-3]

All values must be string (character). This comparison uses value-2 as a pattern to match value-1. The optional ESCAPE sub-clause specifies an escape character for the pattern, allowing the pattern to use '%' and '_' (and the escape character) for matching. The ESCAPE value must be a single character string. In the pattern, the ESCAPE character precedes any character to be escaped.

For example, to match a string ending with '%', use:

x LIKE '%/%' ESCAPE '/'

A more contrived example that escapes the escape character:

y LIKE '/%//%' ESCAPE '/'

... matches any string beginning with '%/'.

The optional NOT reverses the result so that:

z NOT LIKE 'abc%'

is equivalent to:

NOT z LIKE 'abc%'
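The pattern table and the ESCAPE examples above can be tried out with the sketch below. It assumes SQLite through Python's standard sqlite3 module; the helper name like is hypothetical. Note that SQLite's LIKE ignores case for ASCII letters by default, which does not affect these lowercase examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def like(value, pattern, escape=None):
    """Return True if `value LIKE pattern [ESCAPE escape]` holds (hypothetical helper)."""
    if escape is None:
        row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    else:
        row = conn.execute(
            "SELECT ? LIKE ? ESCAPE ?", (value, pattern, escape)
        ).fetchone()
    return bool(row[0])

# (match value, pattern, expected result), copied from the table above.
cases = [
    ("abc", "_b_", True), ("ab", "_b_", False),
    ("abc", "%b%", True), ("ab", "%b%", True),
    ("abc", "a_", False), ("ab", "a_", True),
    ("abc", "a%_", True), ("ab", "a%_", True),
]
for value, pattern, expected in cases:
    print(value, pattern, like(value, pattern), expected)

# ESCAPE examples: a string ending with '%', and one beginning with '%/'.
print(like("50%", "%/%", "/"))
print(like("%/x", "/%//%", "/"))
```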

IS NULL Operator

A database null in a table column has a special meaning -- the value of the column is not currently known (missing), although it may become known at a later time. A database null may represent any value in the future, but the value is not available at this time. Since two null columns may eventually be assigned different values, one null can't be compared to another in the conventional way. Under standard SQL null semantics, the following comparison does not work as a test for null: it evaluates to unknown for every row, so it never matches:

WHERE qty = NULL

A special comparison operator -- IS NULL, tests a column for null. It has the following general format:

value-1 IS [NOT] NULL

This comparison returns true if value-1 contains a null and false otherwise. The optional NOT reverses the result:

value-1 IS NOT NULL

is equivalent to:

NOT value-1 IS NULL

For example,

SELECT * FROM sp WHERE qty IS NULL

sno	pno	qty
S1	P1	NULL
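The behaviour of null in comparisons can be seen in a short sketch, assuming SQLite through Python's standard sqlite3 module and the sp table from this post (column types are assumptions). Python's None stands for a database null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# Comparing with = NULL yields unknown for every row, so nothing matches.
eq_rows = conn.execute("SELECT * FROM sp WHERE qty = NULL").fetchall()

# IS NULL is the test that actually finds the null row.
is_null_rows = conn.execute("SELECT * FROM sp WHERE qty IS NULL").fetchall()

# The null row satisfies neither a comparison nor its negation.
ge_rows = conn.execute("SELECT * FROM sp WHERE qty >= 200").fetchall()
not_ge_rows = conn.execute("SELECT * FROM sp WHERE NOT (qty >= 200)").fetchall()

print(eq_rows)
print(is_null_rows)
print(len(ge_rows), len(not_ge_rows))
```

The last two queries together return only three of the four rows, because the row with a null qty is filtered out of both.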

Logical Operators

The logical operators are AND, OR, NOT. They take logical expressions as operands and produce a logical result (True, False, Unknown). In logical expressions, parentheses are used for grouping.

AND Operator

The AND operator combines two logical operands. The operands are comparisons or logical expressions. It has the following general format:

predicate-1 AND predicate-2

AND returns:

True -- if both operands evaluate to true
False -- if either operand evaluates to false
Unknown -- otherwise (one operand is true and the other is unknown or both are unknown)

Truth tables for AND:

AND	T	F	U
T	T	F	U
F	F	F	F
U	U	F	U

Input 1	Input 2	AND Result
True	True	True
True	False	False
False	False	False
False	True	False
Unknown	Unknown	Unknown
Unknown	True	Unknown
Unknown	False	False
True	Unknown	Unknown
False	Unknown	False

For example,

SELECT *
FROM sp
WHERE sno='S3' AND qty < 500

sno	pno	qty
S3	P2	200

OR Operator

The OR operator combines two logical operands. The operands are comparisons or logical expressions. It has the following general format:

predicate-1 OR predicate-2

OR returns:

True -- if either operand evaluates to true
False -- if both operands evaluate to false
Unknown -- otherwise (one operand is false and the other is unknown or both are unknown)

Truth tables for OR:

OR	T	F	U
T	T	T	T
F	T	F	U
U	T	U	U

Input 1	Input 2	OR Result
True	True	True
True	False	True
False	False	False
False	True	True
Unknown	Unknown	Unknown
Unknown	True	True
Unknown	False	Unknown
True	Unknown	True
False	Unknown	Unknown

For example,

SELECT *
FROM s
WHERE sno='S3' OR city = 'London'

sno	name	city
S2	John	London
S3	Mario	Rome

AND has a higher precedence than OR, so the following expression:

a OR b AND c

is equivalent to:

a OR (b AND c)
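The precedence rule can be checked with values for which the two groupings disagree. This is a small sketch assuming SQLite through Python's standard sqlite3 module, where 1 stands for true and 0 for false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# a is true, b is true, c is false: the two groupings give different answers.
a, b, c = 1, 1, 0

unparenthesised = conn.execute("SELECT ? OR ? AND ?", (a, b, c)).fetchone()[0]
and_first = conn.execute("SELECT ? OR (? AND ?)", (a, b, c)).fetchone()[0]
or_first = conn.execute("SELECT (? OR ?) AND ?", (a, b, c)).fetchone()[0]

print(unparenthesised, and_first, or_first)
```

When the OR is meant to be evaluated first, the parentheses have to be written explicitly.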

NOT Operator

The NOT operator inverts the result of a comparison expression or a logical expression. It has the following general format:

NOT predicate-1

Truth tables for NOT:

NOT
T	F
F	T
U	U

Input	NOT Result
True	False
False	True
Unknown	Unknown

Example query:

SELECT *
FROM sp
WHERE NOT sno = 'S3'

sno	pno	qty
S1	P1	NULL
S2	P1	200
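The truth tables for AND, OR and NOT given above can also be written as three small functions. This is a sketch, not part of SQL itself: the function names and3, or3 and not3 are made up, and Python's None stands for unknown. The loop cross-checks the functions against SQLite (through Python's standard sqlite3 module), which is an assumption about the database in use.

```python
import sqlite3
from itertools import product

def and3(a, b):
    # False wins over everything; otherwise unknown wins over True.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    # True wins over everything; otherwise unknown wins over False.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    # Unknown stays unknown; True and False swap.
    return None if a is None else not a

def to_logical(v):
    # SQLite reports 1, 0 or NULL; map them to True, False or None.
    return None if v is None else bool(v)

conn = sqlite3.connect(":memory:")
mismatches = []
for a, b in product([True, False, None], repeat=2):
    row = conn.execute("SELECT ? AND ?, ? OR ?, NOT ?", (a, b, a, b, a)).fetchone()
    from_sql = tuple(to_logical(v) for v in row)
    from_python = (and3(a, b), or3(a, b), not3(a))
    if from_sql != from_python:
        mismatches.append((a, b, from_sql, from_python))

print(mismatches)
```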

Sunday, December 18, 2016

Different Types of Databases

There are different types of databases, categorised on the basis of their function. The top 12 that you may come across are:

1.0 Relational Databases

This is the most common of all the different types of databases. The data in a relational database is stored in various data tables. Each table has a key field which is used to connect it to other tables, so the tables are related to each other through several key fields. These databases are extensively used in various industries and will be the type you are most likely to come across when working in IT.
Examples of relational databases are Oracle, Sybase and Microsoft SQL Server, and they are often key parts of the process of software development. Hence you should ensure you include any work required on the database as part of your project when creating a project plan and estimating project costs.

2.0 Operational Databases

In its day-to-day operation, an organisation generates a huge amount of data. Think of things such as inventory management, purchases, transactions and financials. All this data is collected in a database which is known by several names, such as operational/production database, subject-area database (SADB) or transaction database.
An operational database is usually hugely important to organisations, as it includes the customer database, personal database and inventory database, i.e. the details of how much of a product the company has as well as information on the customers who buy it. The data stored in operational databases can be changed and manipulated depending on what the company requires.

3.0 Database Warehouses

Organisations are required to keep all relevant data for several years. In the UK it can be as long as 6 years. This data is also an important source of information for analysing and comparing the current year's data with that of past years, which makes it easier to determine key trends. All this data from previous years is stored in a database warehouse. Since the stored data has gone through all kinds of screening, editing and integration, it does not need any further editing or alteration.
With this type of database, ensure that the software requirements specification (SRS) is formally approved as part of the project quality plan.

4.0 Distributed Databases

Many organisations have several office locations, manufacturing plants, regional offices, branch offices and a head office at different geographic locations. Each of these work groups may have their own database which together will form the main database of the company. This is known as a distributed database.

5.0 End-User Databases

There is a variety of data available at the workstation of all the end users of any organisation. Each workstation is like a small database in itself which includes data in spreadsheets, presentations, word files, note pads and downloaded files. All such small databases form a different type of database called the end-user database.

6.0 External Database

There is a sea of information available in the outside world which an organisation may require. Much of it is privately owned data, meant for commercial usage, to which one can have conditional and limited access, often at considerable cost. All such databases outside the organisation which are of use and have limited access are together called external databases.

7.0 Hypermedia Database

Most websites have various interconnected multimedia pages which might include text, video clips, audio clips, photographs and graphics. These all need to be stored and “called” from somewhere when the webpage is created. All of them together form the hypermedia database.
Please note that if you are creating such a database from scratch, you should be generous when creating a project plan, detailed when defining the business requirements documentation (BRD) and meticulous in your project cost controls. I have seen too many projects where the creation of one of these databases has caused scope creep and an out-of-control budget.

8.0 Navigational Database

In a navigational database, items are found by following references from other objects: one has to navigate from one reference to another, or from one object to another. It might use modern systems like XPath. One of its applications is in air flight management systems.

9.0 In-Memory Database

An in-memory database stores data in a computer’s main memory instead of using a disk-based storage system. It is faster than a disk-based database because accessing main memory avoids disk seek time. In-memory databases find their application in telecommunications network equipment.

10.0 Document-Oriented Database

A document-oriented database is a different type of database, used in document-oriented applications. The data is stored in the form of text records instead of being stored in a data table as usually happens.

11.0 Real-Time Database

A real-time database handles data which keeps changing constantly. An example of this is a stock market database, where the value of shares changes every minute and needs to be updated in the real-time database. This type of database is also used in medical and scientific analysis, banking, accounting, process control, reservation systems etc., essentially anything which requires access to fast-moving and constantly changing information.
Assume that this will require much more time than a normal relational database when it comes to the software testing life cycle, as these are much more complicated to test efficiently within normal timeframes.

12.0 Analytical Database

An analytical database is used to store information from different types of databases such as selected operational databases and external databases. Other names given to analytical databases are information databases, management databases or multi-dimensional databases. The data stored in an analytical database is used by the management for analysis purposes, hence the name. The data in an analytical database cannot be changed or manipulated.

Different Types of Databases Top 12 - Tip

Of the different types of databases, relational is the most common and includes such well-known names as Oracle, Sybase and SQL Server. However, as a project manager you need to be prepared for anything, which is why having a high-level view of the different databases is useful, particularly when managing a software development life cycle. Regarding the remainder, you will hear a great deal about database warehouses. This is a highly specialised area which involves mining the data produced to generate meaningful trends and reports for senior management to act upon.

source
http://www.my-project-management-expert.com/different-types-of-databases.html

http://www.my-project-management-expert.com/different-types-of-databases-2.html
