Thursday, 2 July 2015

DELL BOOMI ATOMSPHERE - Core Terms and Concepts

There are several important terms and concepts to understand when working with Dell Boomi AtomSphere:

Atom
Atoms are the lightweight run-time engines that contain process configuration information and actually do the work.

Atoms can be downloaded and installed to a machine behind your firewall to access local resources such as on-premise applications or databases, or deployed to the Dell Boomi data center if only web accessible applications or data sources need to be integrated. You will typically deploy multiple integration processes to a single Atom.

Component
Components are the reusable configuration objects used in processes.

Components are reusable and include certificates, connections, connector operations, cross reference tables, document caches, maps, map functions, processes, process properties, profiles, queues, trading partners and web services. 

Connector
Connectors get data into and send data out of processes.

Connectors abstract the technical details of communicating with various applications, data sources and communication protocols. They are actually comprised of two components: a connection and an operation. The connection represents the "endpoint" and contains the physical connection details, such as an FTP or database host, a web services URL and/or login credentials. The operation represents a specific action to perform against that connection, such as a database SELECT query, an FTP PUT or a specific web service call. You can think of the connection as the "where" and the operation as the "how". For example, when extracting customer records from Salesforce, the connection represents your Salesforce organization account user name and password and the operation represents the "query customer" action.
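The connection/operation split can be pictured with a small sketch. This is not Boomi's actual API, just a hypothetical illustration of "where" versus "how"; all class and field names here are made up:

```python
# Hypothetical sketch (not Boomi's API): a connector split into a
# connection ("where") and an operation ("how").

class Connection:
    """Holds the physical endpoint details, e.g. host and credentials."""
    def __init__(self, host, username, password):
        self.host = host
        self.username = username
        self.password = password

class Operation:
    """A specific action to perform against a connection."""
    def __init__(self, action, target):
        self.action = action      # e.g. "QUERY", "GET", "PUT"
        self.target = target      # e.g. an object name or file path

def execute(connection, operation):
    # A real connector would open the connection and run the action;
    # here we only describe what would happen.
    return f"{operation.action} {operation.target} @ {connection.host}"

conn = Connection("login.salesforce.com", "user@example.com", "secret")
op = Operation("QUERY", "Customer")
print(execute(conn, op))  # QUERY Customer @ login.salesforce.com
```

The same connection can be reused with many different operations, which is exactly why AtomSphere treats them as two separate components.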

Execution
An execution is the result of running a process.

Map
Maps allow data to be transformed from one format to another — or specifically, from one profile to another.

Fields or elements are mapped by dragging and dropping a source element to a destination element. Map functions can be used to perform simple or complex field-level manipulations as values are mapped.
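The idea of mapping one profile to another with field-level map functions can be sketched in plain Python. The field names and the uppercase function below are invented for illustration only:

```python
# Illustrative sketch (not Boomi itself): copying a source record into a
# destination layout, applying a "map function" to one field on the way.

def to_upper(value):          # a simple field-level map function
    return value.upper()

# source element -> (destination element, optional map function)
field_map = {
    "cust_name": ("CustomerName", to_upper),
    "cust_id":   ("CustomerId", None),
}

def apply_map(source, field_map):
    destination = {}
    for src_field, (dst_field, func) in field_map.items():
        value = source[src_field]
        destination[dst_field] = func(value) if func else value
    return destination

record = {"cust_name": "Acme Corp", "cust_id": "C-100"}
print(apply_map(record, field_map))
# {'CustomerName': 'ACME CORP', 'CustomerId': 'C-100'}
```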

Process
Processes represent a single type of data to integrate between two or more systems. They are comprised of a series of steps (represented in the UI by shapes) that specify the business logic and transformation requirements.

Processes are the central components in AtomSphere. Process steps transform, route and otherwise manipulate data to attain the desired end result. Every process begins with a Start shape that gets data from some source and usually ends with one or more connectors to send the data to the destination(s).

Profile
Profiles define the structure or layout of the data to be processed. They are used when reading from or writing to a given format. Several types of profiles are available:
Database — When used for reading, the database profile represents the fields in the result set(s) of SELECT statements. When used for writing, the profile represents the values used in an UPDATE, INSERT or DELETE statement. Database profiles can also contain elements to be used as parameters in dynamic statements.
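The two roles of a database profile, result-set columns for reading and run-time parameters for dynamic statements, can be illustrated with a small script. SQLite is used here only because it ships with Python; the table and values are made up:

```python
import sqlite3

# Sketch of the database-profile idea: the SELECT's result-set columns
# define the read layout, and the placeholder acts as a parameter in a
# dynamic statement.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT, region TEXT)")
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (1, "Acme", "East"))
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (2, "Globex", "West"))

# The region value is supplied at run time, like a profile element used
# as a parameter in a dynamic statement.
rows = conn.execute(
    "SELECT id, name FROM customer WHERE region = ?", ("East",)
).fetchall()
print(rows)  # [(1, 'Acme')]
```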

EDI — Electronic Data Interchange (EDI) is defined as the exchange of business data from one computer to another computer using a public standard format. EDI replaces the traditional processes of preparing data in paper form and sending it by mail or by facsimile. Business communications with EDI are not restricted by software, equipment or computer. EDI profiles represent the structure of the various EDI documents sent through AtomSphere processes, including all the segments, data elements, looping and other configuration information. Similar to the XML profile, the EDI profile allows you to configure repeating data sets (loops). You can also organize segments in Header, Detail and Summary loop sections.

Flat File — Flat files are common, simple formats that typically contain one record per line, like a comma separated value (CSV) file. Each record contains a number of fields or elements that are either delimited or data positioned (fixed width). Flat file profiles can represent simple structures with a single record type (e.g., CSV file) to moderately complex structures with multiple record types to accommodate parent-child relationships or other repeating record types within the same file (e.g., header and detail records).
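A moderately complex flat file with header and detail records might be parsed like the sketch below. The record-type codes ("H" and "D") and fields are invented for illustration:

```python
import csv, io

# A flat file with two record types: "H" (header) starts an order and
# "D" (detail) lines attach to the most recent header.

flat_file = """H,1001,2015-07-02
D,1001,WIDGET,5
D,1001,GADGET,2
H,1002,2015-07-02
D,1002,WIDGET,1
"""

orders = []
for row in csv.reader(io.StringIO(flat_file)):
    if row[0] == "H":            # header record starts a new order
        orders.append({"order": row[1], "date": row[2], "lines": []})
    elif row[0] == "D":          # detail record belongs to that order
        orders[-1]["lines"].append({"item": row[2], "qty": int(row[3])})

print(len(orders))           # 2
print(orders[0]["lines"])    # [{'item': 'WIDGET', 'qty': 5}, {'item': 'GADGET', 'qty': 2}]
```

The first field plays the role a flat file profile plays in AtomSphere: it tells the parser which record layout applies to each line.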

XML — XML is a common hierarchical format used in much of the Internet communication today characterized by the use of tags that describe data elements. Tags can be nested within one another to establish parent-child relationships. XML profiles are used by web services and the majority of AtomSphere application connectors.

JSON (JavaScript Object Notation) is a text-based data interchange format. It is easy for humans to read and write and for machines to parse and generate. The JSON format is often used to transmit structured data over a network connection. It is similar to XML and is often used as an alternative to XML.
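A tiny example of JSON's nesting, analogous to XML's parent-child tags (the record shown is made up):

```python
import json

# A customer record in JSON: human-readable text that parses into
# nested structures, much like nested XML elements.

text = '{"customer": {"id": "C-100", "name": "Acme Corp", "orders": [1001, 1002]}}'
data = json.loads(text)
print(data["customer"]["name"])       # Acme Corp
print(data["customer"]["orders"][0])  # 1001
```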

Shape
Process shapes are the steps linked together to form the business logic for a process.
Shapes can perform execution tasks like document splitting and data transformation, or logic tasks like decision and routing. Each shape has its own configuration settings. Shapes often reference components such as profiles, connections, operations, and maps.


Wednesday, 1 July 2015

DELL BOOMI ATOMSPHERE

The main component in Dell Boomi AtomSphere is the process. A process represents a business process- or transaction-level interface between two systems. Examples of a process might be "Salesforce Account Synchronization to a legacy accounting system" or "Sales Orders from Company ABC to QuickBooks." Processes contain a left-to-right series of shapes connected together like a flow chart to illustrate the steps required to transform, route and otherwise manipulate the data from source to destination.

Processes use connectors to get and send data. Every process begins with the Start shape (containing the inbound connector) on the left. A process can have any number of execution and logic shapes within it. A process typically ends with one or more outbound connectors on the right.

Profiles represent the structure and format of the source and destination data. The following types of profiles are available: Database, EDI, Flat File, XML and JSON. A process typically contains one or more maps along the way that transform the source profile into the destination profile.

When a process is executed, the inbound connector in the Start shape retrieves its data (for example: performs an FTP GET, executes a database SELECT query or invokes a web service call) and passes the resulting data into the process as one or more documents. Documents move through the process steps sequentially, where each shape performs a specific task on the data and then passes it onto the next shape.
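The document flow described above can be modeled as a toy pipeline. This is not Boomi code, just a sketch of the idea: a start step produces documents and each subsequent shape transforms the batch and passes it on (the step functions are invented):

```python
# Toy model of a process: documents flow left to right through steps.

def start_step():
    # stands in for an inbound connector (FTP GET, SELECT, web service...)
    return [{"name": "acme", "active": "Y"}, {"name": "globex", "active": "N"}]

def filter_active(docs):
    # a routing/decision-style step
    return [d for d in docs if d["active"] == "Y"]

def uppercase_names(docs):
    # a transformation-style step
    return [{**d, "name": d["name"].upper()} for d in docs]

documents = start_step()
for step in (filter_active, uppercase_names):   # shapes, left to right
    documents = step(documents)
print(documents)  # [{'name': 'ACME', 'active': 'Y'}]
```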

What Is Integration?

Integration is the combining of software or hardware components or both into an overall system. Generally speaking, there are two types of systems integration:

B2B Integration — Integrating data or systems across two or more organizations.

Application Integration — Integrating data or systems within a single organization.

What is B2B Integration?

Business-to-Business integration is not a new concept; in fact, many information technology organisations have been running B2B Integration projects since the late 1960s. Simply put, B2B Integration means the integration, automation and optimisation of key business processes that extend outside the four walls of a company's organisation.

While these critical processes vary by vertical, geography and company size, one point remains consistent: the automation of key external business processes, those that touch your customers and your suppliers, delivers sustainable competitive advantage.

For example, by receiving purchase orders from your customers electronically, you can process order information faster and more accurately. Processing these orders in real time allows companies to be more responsive to their customers, improve customer service and increase sales. Similarly, by connecting to external suppliers electronically, companies can achieve real-time visibility into global shipments, automate the warehouse or distribution centres and optimise inventory or stock control, ultimately increasing working capital and lowering costs.

B2B Integration began with large companies mandating electronic methods of receiving business information. It evolved through the widespread adoption of Electronic Data Interchange (EDI) and in recent years has benefited from technology innovations such as the advent of the Internet, XML, web services and SOA, Business Process Management and SaaS. These innovations have made the benefits available to companies of every size. There are a number of ways to implement B2B Integration solutions, but the solution approach should be driven by a company's business needs and objectives, rather than a particular implementation or technology set.

https://en.wikipedia.org/wiki/Business-to-business

iPaaS, Native Cloud, Multi-Tenancy

Integration Platform as a Service (iPaaS) is a suite of cloud services enabling development, execution and governance of integration flows connecting any combination of on-premises and cloud-based processes, services, applications and data, within a single organization or across multiple organizations.



Native Cloud Application (NCA) refers to a type of computer software that natively utilizes services and infrastructure provided by cloud computing providers.



Multi-tenancy is an architecture in which a single instance of a software application serves multiple customers. Each customer is called a tenant. Tenants may be given the ability to customize some parts of the application, such as color of the user interface (UI) or business rules, but they cannot customize the application's code.

Multi-tenancy can be economical because software development and maintenance costs are shared. It can be contrasted with single-tenancy, an architecture in which each customer has their own software instance and may be given access to code. With a multi-tenancy architecture, the provider only has to make updates once. With a single-tenancy architecture, the provider has to touch multiple instances of the software in order to make updates.

In cloud computing, the meaning of multi-tenancy architecture has broadened because of new service models that take advantage of virtualization and remote access. A software-as-a-service (SaaS) provider, for example, can run one instance of its application on one instance of a database and provide web access to multiple customers. In such a scenario, each tenant's data is isolated and remains invisible to other tenants.
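The SaaS scenario above, one application instance and one database serving many tenants, can be sketched as follows. The table and column names are made up; the point is that every data access is scoped by a tenant id:

```python
import sqlite3

# Sketch of multi-tenant data isolation: one database instance, with
# every query filtered by tenant_id so tenants never see each other's rows.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoices (tenant_id TEXT, amount REAL)")
db.executemany("INSERT INTO invoices VALUES (?, ?)",
               [("acme", 100.0), ("acme", 250.0), ("globex", 75.0)])

def invoices_for(tenant_id):
    # the tenant filter is applied on every access, never left to the caller
    return db.execute(
        "SELECT amount FROM invoices WHERE tenant_id = ? ORDER BY amount",
        (tenant_id,)).fetchall()

print(invoices_for("acme"))    # [(100.0,), (250.0,)]
print(invoices_for("globex"))  # [(75.0,)]
```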

Tuesday, 30 June 2015

DELL BOOMI

The main component in Dell Boomi AtomSphere is the process. A process represents a business process- or transaction-level interface between two systems.

Processes contain a left-to-right series of shapes connected together like a flow chart to illustrate the steps required to transform, route and otherwise manipulate the data from source to destination.

Processes use connectors to get and send data. Every process begins with the Start shape on the left. A process can have any number of execution and logic shapes within it. A process typically ends with one or more outbound connectors on the right.

Profiles represent the structure and format of the source and destination data. The following types of profiles are available: Database, EDI, Flat File, XML and JSON. A process typically contains one or more maps along the way that transform the source profile into the destination profile.

When a process is executed, the inbound connector in the Start shape retrieves its data (for example: performs an FTP GET, executes a database SELECT query or invokes a web service call) and passes the resulting data into the process as one or more documents. Documents move through the process steps sequentially, where each shape performs a specific task on the data and then passes it onto the next shape.

AtomSphere Editions

You can get a 30-day free trial account at any time. The trial is a separate edition of Dell Boomi AtomSphere called "Personal Edition". Once your trial period has expired, you have the option to subscribe fully to AtomSphere; you can then deploy the integration processes you built during your trial, subject to the same terms.

When you purchase AtomSphere you have the choice of five editions:
    • Base Edition
    • Professional Edition
    • Professional Plus Edition
    • Enterprise Edition
    • Enterprise Plus Edition

Tuesday, 12 May 2015

Transformations, Steps, and Hops

A transformation is a network of logical tasks called steps. Transformations are essentially data flows. For example, a database developer might create a transformation that reads a flat file, filters it, sorts it, and loads it to a relational database table. If the developer detects an error condition, then instead of sending the data to a Dummy step (which does nothing), the data can be logged back to a table. A transformation is, in essence, a directed graph of a logical set of data transformation configurations. Transformation file names have a .ktr extension.

The two main components associated with transformations are steps and hops:
Steps are the building blocks of a transformation, for example a text file input or a table output. There are over 140 steps available in Pentaho Data Integration, grouped according to function: input, output, scripting, and so on. Each step in a transformation is designed to perform a specific task, such as reading data from a flat file, filtering rows, or logging to a database. Steps can be configured to perform the tasks you require.
Hops are data pathways that connect steps together and allow schema metadata to pass from one step to another. Although a transformation may look like a sequential execution, that is not the case: hops determine the flow of data through the steps, not necessarily the sequence in which they run. When you run a transformation, each step starts up in its own thread and pushes and passes data.
Note: All steps are started and run in parallel so the initialization sequence is not predictable. That is why you cannot, for example, set a variable in a first step and attempt to use that variable in a subsequent step.
You can connect steps together, edit steps, and open the step contextual menu by clicking a step; click the down arrow to open the contextual menu. For information about connecting steps with hops, see More About Hops.
A step can have many connections: some join two steps together, and some only serve as an input or output for a step. Hops are represented in Spoon as arrows. Hops allow data to be passed from step to step, and also determine the direction and flow of data through the steps. If a step sends outputs to more than one step, the data can either be copied to each step or distributed among them.
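The "each step runs in its own thread and pushes data" model can be sketched with queues standing in for hops. This is a simplified illustration, not PDI's actual engine; the sentinel convention is an assumption of the sketch:

```python
import threading, queue

# Toy model: each step runs in its own thread; rows flow through hops
# (queues); a sentinel (None) marks the end of the data stream.

DONE = None

def producer(out_hop):
    for row in [1, 2, 3]:
        out_hop.put(row)
    out_hop.put(DONE)

def doubler(in_hop, out_hop):
    # consumes rows as they arrive, regardless of start-up order
    while (row := in_hop.get()) is not DONE:
        out_hop.put(row * 2)
    out_hop.put(DONE)

hop1, hop2 = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=producer, args=(hop1,)),
           threading.Thread(target=doubler, args=(hop1, hop2))]
for t in threads:
    t.start()

results = []
while (row := hop2.get()) is not DONE:
    results.append(row)
for t in threads:
    t.join()
print(results)  # [2, 4, 6]
```

Because both threads start concurrently, nothing about their initialization order is guaranteed, which mirrors the note above about why a variable set in one step cannot be read in another step of the same transformation.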

Spoon - Editor

Component Name — Function

Toolbar — Single-click access to common actions such as creating a new file, opening existing documents, saving, and saving under a new name.

Perspectives Toolbar — Switch between the different perspectives.
  • Data Integration — Create ETL transformations and jobs
  • Instaview — Use pre-made templates to create visualizations from PDI transformations
  • Visualize — Test reporting and OLAP metadata models created in the Model perspective using the Report Design Wizard and Analyzer clients
  • Model Editor — Design reporting and OLAP metadata models which can be tested right from within the Visualization perspective or published to the Pentaho BA Server
  • Schedule — Manage scheduled ETL activities on the Data Integration Server

Sub-toolbar — Provides buttons for quick access to common actions specific to the transformation or job, such as Run, Preview, and Debug.

Design and View Tabs — The Design tab of the Explore pane provides an organized list of transformation steps or job entries used to build transformations and jobs. Transformations are created by simply dragging transformation steps from the Design tab onto the canvas and connecting them with hops to describe the flow of data. The View tab of the Explore pane shows information for each job or transformation, such as available database connections and which steps and hops are used.

Canvas — Main design area for building transformations and jobs describing the ETL activities you want to perform.



Table 1. Spoon Icon Descriptions

The toolbar icons (images omitted here) perform the following actions:
  • Create a new job or transformation
  • Open a transformation/job from a file if you are not connected to a repository, or from the repository if you are connected to one
  • Explore the repository
  • Save the transformation/job to a file or to the repository
  • Save the transformation/job under a different name or file name (Save as)
  • Run the transformation/job; runs the current transformation from the XML file or repository
  • Pause the transformation
  • Stop the transformation
  • Preview the transformation; runs the current transformation from memory so you can preview the rows produced by selected steps
  • Run the transformation in debug mode, which allows you to troubleshoot execution errors
  • Replay the processing of a transformation
  • Verify the transformation
  • Run an impact analysis on the database
  • Generate the SQL needed to run the loaded transformation
  • Launch the database explorer, allowing you to preview data, run SQL queries, generate DDL and more
  • Hide the execution results pane
  • Lock the transformation

Pentaho Data Integration Architecture

Spoon is the design interface for building ETL jobs and transformations. Spoon provides a drag-and-drop interface that allows you to graphically describe what you want to take place in your transformations. Transformations can then be executed locally within Spoon, on a dedicated Data Integration Server, or a cluster of servers.
The Data Integration Server is a dedicated ETL server whose primary functions are:
  • Execution — Executes ETL jobs and transformations using the Pentaho Data Integration engine
  • Security — Allows you to manage users and roles (default security) or integrate security with your existing security provider such as LDAP or Active Directory
  • Content Management — Provides a centralized repository that allows you to manage your ETL jobs and transformations. This includes full revision history on content and features such as sharing and locking for collaborative development environments.
  • Scheduling — Provides the services allowing you to schedule and monitor activities on the Data Integration Server from within the Spoon design environment
Pentaho Data Integration is composed of the following primary components:
  • Spoon. Introduced earlier, Spoon is a desktop application that uses a graphical interface and editor for transformations and jobs. Spoon provides a way for you to create complex ETL jobs without having to read or write code. When you think of Pentaho Data Integration as a product, Spoon is what comes to mind because, as a database developer, this is the application on which you will spend most of your time. Any time you author, edit, run or debug a transformation or job, you will be using Spoon.
  • Pan. A standalone command line process that can be used to execute transformations you created in Spoon. Pan is the data transformation engine: it reads data from and writes data to various data sources, and also allows you to manipulate data.
  • Kitchen. A standalone command line process that executes jobs designed in the Spoon graphical interface, stored either as XML or in a database repository. Jobs are usually scheduled to run in batch mode at regular intervals.
  • Carte. Carte is a lightweight Web container that allows you to set up a dedicated, remote ETL server. This provides similar remote execution capabilities as the Data Integration Server, but does not provide scheduling, security integration, and a content management system.

What's with all the Culinary Terms?

If you are new to Pentaho, you may sometimes see or hear Pentaho Data Integration referred to as "Kettle." To avoid confusion, all you need to know is that Pentaho Data Integration began as an open source project called "Kettle." The term K.E.T.T.L.E is a recursive acronym that stands for Kettle Extraction Transformation Transport Load Environment. When Pentaho acquired Kettle, the name was changed to Pentaho Data Integration. Other PDI components, such as Spoon, Pan, and Kitchen, have names that were originally meant to support a "restaurant" metaphor of ETL offerings.

Pentaho - Introduction

Introduction

Pentaho Data Integration (PDI) is a flexible tool that allows you to collect data from disparate sources such as databases, files, and applications, and turn the data into a unified format that is accessible and relevant to end users. PDI provides the Extraction, Transformation, and Loading (ETL) engine that facilitates the process of capturing the right data, cleansing the data, and storing the data using a uniform and consistent format.
PDI provides support for slowly changing dimensions and surrogate keys for data warehousing, allows data migration between databases and applications, is flexible enough to load giant datasets, and can take full advantage of cloud, clustered, and massively parallel processing environments. You can cleanse your data using transformation steps that range from very simple to very complex.

Friday, 8 May 2015

Pentaho Data Integration (Kettle)

PDI is commonly used for:
  • Migrating data between applications or databases
  • Exporting data from databases to flat files
  • Loading data massively into databases
  • Data cleansing
  • Integrating applications
PDI is easy to use. Every process is created with a graphical tool where you specify what to do without writing code to indicate how to do it; because of this, you could say that PDI is metadata oriented.
PDI can be used as a standalone application, or it can be used as part of the larger Pentaho Suite. As an ETL tool, it is the most popular open source tool available. PDI supports a vast array of input and output formats, including text files, data sheets, and commercial and free database engines. Moreover, the transformation capabilities of PDI allow you to manipulate data with very few limitations.

Difference Between Algorithm and Pseudocode

Algorithm vs Pseudocode 
An algorithm is simply a solution to a problem. An algorithm presents the solution to a problem as a well defined set of steps or instructions. Pseudocode is a general way of describing an algorithm. Pseudocode does not use the syntax of a specific programming language and therefore cannot be executed on a computer. But it closely resembles the structure of a programming language and contains roughly the same level of detail.
Algorithm
An algorithm gives a solution to a particular problem as a well defined set of steps. A recipe in a cookbook is a good example of an algorithm. When a computer is used for solving a particular problem, the steps of the solution must be communicated to the computer. This makes the study of algorithms a very important part of computer science. An algorithm is executed in a computer by combining many elementary operations, such as additions and subtractions, to perform more complex mathematical operations. But translating the idea of an algorithm into computer code is not straightforward; in particular, converting an algorithm into a low level language such as assembly language can be far more tedious than using a high level language such as C or Java. When designing an algorithm, it is important to analyse the resources (such as time and storage) it requires. Notations such as big O notation are used for performing time and storage analysis on algorithms. Algorithms can be expressed using natural languages, pseudocode, flowcharts, etc.
Pseudocode
Pseudocode is one of the methods that can be used to represent an algorithm. It is not written in the specific syntax of a programming language and therefore cannot be executed on a computer. There are many formats used for writing pseudocode, and most of them borrow structures from popular programming languages such as C, Lisp and FORTRAN. Natural language is used when presenting details that are not important. Most algorithms are presented using pseudocode, since it can be read and understood by programmers who are familiar with different programming languages. Some languages, such as Pascal, have syntax that is very similar to pseudocode, making the transformation from pseudocode to the corresponding program code easier. Pseudocode allows you to include control structures such as WHILE, IF-THEN-ELSE, REPEAT-UNTIL, FOR, and CASE, which are present in many high level languages.
What is the difference between Algorithm and Pseudocode?
An algorithm is a well defined sequence of steps that provides a solution for a given problem, while pseudocode is one of the methods that can be used to represent an algorithm. While algorithms can be written in natural language, pseudocode is written in a format that is closely related to high level programming language structures. Pseudocode does not use specific programming language syntax, and therefore can be understood by programmers who are familiar with different programming languages. Additionally, transforming an algorithm presented in pseudocode into programming code can be much easier than converting an algorithm written in natural language.

Pseudocode Example:


If student's grade is greater than or equal to 60
    Print "passed"
else
    Print "failed"
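The same pseudocode translated into Python, showing how closely the structures correspond (the cutoff of 60 comes from the pseudocode above):

```python
# The grade pseudocode as runnable Python.

def grade_result(grade):
    if grade >= 60:
        return "passed"
    return "failed"

print(grade_result(75))  # passed
print(grade_result(42))  # failed
```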

Tuesday, 17 March 2015

PL/SQL

PL/SQL is a combination of SQL and the procedural features of programming languages. It was developed by Oracle Corporation in the late 1980s to enhance the capabilities of SQL.

PL/SQL is one of three key programming languages embedded in the Oracle Database, along with SQL itself and Java.

The PL/SQL programming language was developed by Oracle Corporation in the late 1980s as a procedural extension language for SQL and the Oracle relational database. The following are notable facts about PL/SQL:


  • PL/SQL is a completely portable, high-performance transaction-processing language.
  • PL/SQL provides a built-in interpreted and OS independent programming environment.
  • PL/SQL can also directly be called from the command-line SQL*Plus interface.
  • Direct calls can also be made from external programming languages to the database.
  • PL/SQL's general syntax is based on that of the Ada and Pascal programming languages.
  • Apart from Oracle, PL/SQL is available in TimesTen in-memory database and IBM DB2.


Features of PL/SQL:

PL/SQL has the following features:


  • PL/SQL is tightly integrated with SQL.
  • It offers extensive error checking.
  • It offers numerous data types.
  • It offers a variety of programming structures.
  • It supports structured programming through functions and procedures.
  • It supports object-oriented programming.
  • It supports developing web applications and server pages.


Advantages of PL/SQL:

PL/SQL has the following advantages:


  • SQL is the standard database language and PL/SQL is strongly integrated with SQL. PL/SQL supports both static and dynamic SQL. Static SQL supports DML operations and transaction control from PL/SQL blocks. Dynamic SQL allows embedding DDL statements in PL/SQL blocks.
  • PL/SQL allows sending an entire block of statements to the database at one time. This reduces network traffic and provides high performance for the applications.
  • PL/SQL gives high productivity to programmers as it can query, transform, and update data in a database.
  • PL/SQL saves time on design and debugging by strong features, such as exception handling, encapsulation, data hiding, and object-oriented data types.
  • Applications written in PL/SQL are fully portable.
  • PL/SQL provides a high level of security.
  • PL/SQL provides access to predefined SQL packages.
  • PL/SQL provides support for Object-Oriented Programming.
  • PL/SQL provides support for Developing Web Applications and Server Pages.



PL/SQL - Environment Setup:

PL/SQL is not a stand-alone programming language; it is a tool within the Oracle programming environment. SQL*Plus is an interactive tool that allows you to type SQL and PL/SQL statements at the command prompt. These commands are then sent to the database for processing. Once the statements are processed, the results are sent back and displayed on screen.

To run PL/SQL programs, you should have the Oracle RDBMS server installed on your machine, which will take care of executing the SQL commands.

Tuesday, 24 February 2015

Dimension Table

What is Dimension Table?


A dimension table is one that describes the business entities of an enterprise, represented as hierarchical, categorical information such as time, departments, locations, and products. Dimension tables are sometimes called lookup or reference tables.

Location Dimension

In relational data modeling, for normalization purposes, the country, state, county, and city lookups are not merged into a single table. In dimensional data modeling (star schema), these tables would be merged into a single table called LOCATION DIMENSION for performance and data-slicing requirements. This location dimension helps to compare the sales in one region with another region. We may see good sales profit in one region and a loss in another region. If there is a loss, the reasons may be a new competitor in that area, a failure of our marketing strategy, etc.

Example of Location Dimension:

Country Lookup

Country Code Country Name DateTimeStamp
USA United States Of America 1/1/2005 11:23:31 AM

State Lookup

State Code State Name DateTimeStamp
NY New York 1/1/2005 11:23:31 AM
FL Florida 1/1/2005 11:23:31 AM
CA California 1/1/2005 11:23:31 AM
NJ New Jersey 1/1/2005 11:23:31 AM

County Lookup

County Code County Name DateTimeStamp
NYSH Shelby 1/1/2005 11:23:31 AM
FLJE Jefferson 1/1/2005 11:23:31 AM
CAMO Montgomery 1/1/2005 11:23:31 AM
NJHU Hudson 1/1/2005 11:23:31 AM

City Lookup

City Code City Name DateTimeStamp
NYSHMA Manhattan 1/1/2005 11:23:31 AM
FLJEPC Panama City 1/1/2005 11:23:31 AM
CAMOSH San Jose 1/1/2005 11:23:31 AM
NJHUJC Jersey City 1/1/2005 11:23:31 AM

Location Dimension

Location Dimension Id | Country Name | State Name | County Name | City Name | DateTimeStamp

1 USA New York Shelby Manhattan 1/1/2005 11:23:31 AM
2 USA Florida Jefferson Panama City 1/1/2005 11:23:31 AM
3 USA California Montgomery San Jose 1/1/2005 11:23:31 AM
4 USA New Jersey Hudson Jersey City 1/1/2005 11:23:31 AM
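The denormalization above can be sketched in code: the four lookup tables are joined into single location-dimension rows. The data is abbreviated from the tables above, and the dict layout is an assumption of the sketch:

```python
# Merging lookup tables into a denormalized location dimension.

country = {"USA": "United States Of America"}
state   = {"NY": "New York", "FL": "Florida"}
county  = {"NYSH": "Shelby", "FLJE": "Jefferson"}
city    = {"NYSHMA": "Manhattan", "FLJEPC": "Panama City"}

# source rows: (country code, state code, county code, city code)
rows = [("USA", "NY", "NYSH", "NYSHMA"),
        ("USA", "FL", "FLJE", "FLJEPC")]

dimension = [
    {"id": i, "country": country[co], "state": state[st],
     "county": county[cn], "city": city[ci]}
    for i, (co, st, cn, ci) in enumerate(rows, start=1)
]
print(dimension[0])
# one wide row per location, ready for slicing sales by region
```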

Product Dimension

In a relational data model, for normalization purposes, the product category, product sub-category, product, and product feature lookups are not merged into a single table. In dimensional data modeling (star schema), these tables would be merged into a single table called PRODUCT DIMENSION for performance and data-slicing requirements.

Product Category Lookup

Product Category Code Product Category Name DateTimeStamp
1 Apparel 1/1/2005 11:23:31 AM
2 Shoe 1/1/2005 11:23:31 AM
Product Sub-Category Lookup

Product Sub-Category Code | Product Sub-Category Name | DateTimeStamp
11 Shirt 1/1/2005 11:23:31 AM
12 Trouser 1/1/2005 11:23:31 AM
13 Casual 1/1/2005 11:23:31 AM
14 Formal 1/1/2005 11:23:31 AM

Product Lookup

Product Code Product Name DateTimeStamp
1001 Van Heusen 1/1/2005 11:23:31 AM
1002 Arrow 1/1/2005 11:23:31 AM
1003 Nike 1/1/2005 11:23:31 AM
1004 Adidas 1/1/2005 11:23:31 AM

Product Feature Lookup

Product Feature Code Product Feature Description DateTimeStamp
10001 Van-M 1/1/2005 11:23:31 AM
10002 Van-L 1/1/2005 11:23:31 AM
10003 Arr-XL 1/1/2005 11:23:31 AM
10004 Arr-XXL 1/1/2005 11:23:31 AM
10005 Nike-8 1/1/2005 11:23:31 AM
10006 Nike-9 1/1/2005 11:23:31 AM
10007 Adidas-10 1/1/2005 11:23:31 AM
10008 Adidas-11 1/1/2005 11:23:31 AM

Product Dimension

Product Dimension Id Product Category Name Product Sub-Category Name Product Name Product Feature Desc DateTimeStamp
100001 Apparel Shirt Van Heusen Van-M 1/1/2005 11:23:31 AM
100002 Apparel Shirt Van Heusen Van-L 1/1/2005 11:23:31 AM
100003 Apparel Shirt Arrow Arr-XL 1/1/2005 11:23:31 AM
100004 Apparel Shirt Arrow Arr-XXL 1/1/2005 11:23:31 AM
100005 Shoe Casual Nike Nike-8 1/1/2005 11:23:31 AM
100006 Shoe Casual Nike Nike-9 1/1/2005 11:23:31 AM
100007 Shoe Casual Adidas Adidas-10 1/1/2005 11:23:31 AM
100008 Shoe Casual Adidas Adidas-11 1/1/2005 11:23:31 AM


Organization Dimension

In a relational data model, for normalization purposes, the corporate office, region, branch, and employee lookups are kept as separate tables. In a dimensional data model (star schema), these tables are merged into a single table called the ORGANIZATION DIMENSION for performance and data-slicing requirements.
This dimension helps us find the products sold or serviced by the organization's employees. In any industry, we can calculate sales by region, by branch, and by employee. Based on performance, an organization can provide incentives to employees and subsidies to branches to increase sales further.

Corporate Lookup

Corporate Code Corporate Name DateTimeStamp
CO American Bank 1/1/2005 11:23:31 AM

Region Lookup

Region Code Region Name DateTimeStamp
SE South East 1/1/2005 11:23:31 AM
MW Mid West 1/1/2005 11:23:31 AM

Branch Lookup

Branch Code Branch Name DateTimeStamp
FLTM Florida-Tampa 1/1/2005 11:23:31 AM
ILCH Illinois-Chicago 1/1/2005 11:23:31 AM

Employee Lookup

Employee Code Employee Name DateTimeStamp
E1 Paul Young 1/1/2005 11:23:31 AM
E2 Chris Davis 1/1/2005 11:23:31 AM

Organization Dimension

Organization Dimension Id Corporate Name Region Name Branch Name Employee Name DateTimeStamp
1 American Bank South East Florida-Tampa Paul Young 1/1/2005 11:23:31 AM
2 American Bank Mid West Illinois-Chicago Chris Davis 1/1/2005 11:23:31 AM

Time Dimension
In a relational data model, for normalization purposes, the year, quarter, month, and week lookups are kept as separate tables. In a dimensional data model (star schema), these tables are merged into a single table called the TIME DIMENSION for performance and data-slicing requirements.

This dimension helps find the sales done on a daily, weekly, monthly, and yearly basis. We can perform trend analysis by comparing this year's sales with the previous year's, or this week's sales with the previous week's.

Year Lookup

Year Id Year Number DateTimeStamp
1 2004 1/1/2005 11:23:31 AM
2 2005 1/1/2005 11:23:31 AM

Quarter Lookup

Quarter Number Quarter Name DateTimeStamp
1 Q1 1/1/2005 11:23:31 AM
2 Q2 1/1/2005 11:23:31 AM
3 Q3 1/1/2005 11:23:31 AM
4 Q4 1/1/2005 11:23:31 AM

Month Lookup

Month Number Month Name DateTimeStamp
1 January 1/1/2005 11:23:31 AM
2 February 1/1/2005 11:23:31 AM
3 March 1/1/2005 11:23:31 AM
4 April 1/1/2005 11:23:31 AM
5 May 1/1/2005 11:23:31 AM
6 June 1/1/2005 11:23:31 AM
7 July 1/1/2005 11:23:31 AM
8 August 1/1/2005 11:23:31 AM
9 September 1/1/2005 11:23:31 AM
10 October 1/1/2005 11:23:31 AM
11 November 1/1/2005 11:23:31 AM
12 December 1/1/2005 11:23:31 AM

Week Lookup

Week Number Day of Week DateTimeStamp
1 Sunday 1/1/2005 11:23:31 AM
1 Monday 1/1/2005 11:23:31 AM
1 Tuesday 1/1/2005 11:23:31 AM
1 Wednesday 1/1/2005 11:23:31 AM
1 Thursday 1/1/2005 11:23:31 AM
1 Friday 1/1/2005 11:23:31 AM
1 Saturday 1/1/2005 11:23:31 AM
2 Sunday 1/1/2005 11:23:31 AM
2 Monday 1/1/2005 11:23:31 AM
2 Tuesday 1/1/2005 11:23:31 AM
2 Wednesday 1/1/2005 11:23:31 AM
2 Thursday 1/1/2005 11:23:31 AM
2 Friday 1/1/2005 11:23:31 AM
2 Saturday 1/1/2005 11:23:31 AM

Time Dimension

Time Dim Id Year No Day of Year Quarter No Month No Month Name Month Day No Week No Day of Week Cal Date DateTimeStamp
1 2004 1 Q1 1 January 1 1 5 1/1/2004 1/1/2005 11:23:31 AM
2 2004 32 Q1 2 February 1 5 1 2/1/2004 1/1/2005 11:23:31 AM
3 2005 1 Q1 1 January 1 1 7 1/1/2005 1/1/2005 11:23:31 AM
4 2005 32 Q1 2 February 1 5 3 2/1/2005 1/1/2005 11:23:31 AM
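Every column in the time dimension can be derived from the calendar date itself, which is why a time dimension is usually pre-generated by a script rather than loaded from a source system. A minimal sketch using Python's standard library; the column names are illustrative, and the numbering conventions (day of week 1 = Sunday, week number counted in 7-day blocks from January 1) are inferred from the sample rows above:

```python
from datetime import date

def time_dim_row(d: date) -> dict:
    """Derive the time-dimension attributes for a single calendar date."""
    day_of_year = d.timetuple().tm_yday
    quarter = (d.month - 1) // 3 + 1
    # Day of week numbered 1 = Sunday ... 7 = Saturday, as in the Week Lookup.
    day_of_week = d.isoweekday() % 7 + 1
    # Week number counted in 7-day blocks from January 1, matching the sample
    # rows (2/1/2004, day 32 of the year, falls in week 5).
    week_no = (day_of_year - 1) // 7 + 1
    return {
        "year_no": d.year,
        "day_of_year": day_of_year,
        "quarter_no": "Q%d" % quarter,
        "month_no": d.month,
        "month_name": d.strftime("%B"),
        "month_day_no": d.day,
        "week_no": week_no,
        "day_of_week": day_of_week,
        "cal_date": d.strftime("%m/%d/%Y"),
    }
```

Looping this function over a range of dates and assigning a surrogate key per row produces the Time Dimension table shown above.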

Fact Table

What is a Fact Table?


The centralized table in a star schema is called the FACT table. A fact table typically has two types of columns: those that contain facts (measures) and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key made up of all of its foreign keys.

 "Sales Dollar" is a fact(measure) and it can be added across several dimensions. Fact tables store different types of measures like additive, non additive and semi additive measures.

Measure Types

• Additive - Measures that can be added across all dimensions.
• Non-Additive - Measures that cannot be added across any dimension.
• Semi-Additive - Measures that can be added across some dimensions but not others.
A fact table may contain either detail-level facts or facts that have been aggregated (fact tables containing aggregated facts are often called summary tables).
In the real world, it is also possible to have a fact table that contains no measures at all. Such tables are called factless fact tables.
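The difference between additive and semi-additive measures is easiest to see with numbers. Sales dollars can be summed across any dimension, but a snapshot measure such as an account balance can be summed across accounts at one point in time, while summing it across time periods produces a meaningless figure. A small sketch with invented balances:

```python
# Monthly snapshots of two bank-account balances - a classic semi-additive
# measure. Keys are (account, month); values are invented for illustration.
balances = {
    ("A1", "Jan"): 100, ("A1", "Feb"): 120,
    ("A2", "Jan"): 200, ("A2", "Feb"): 180,
}

# Valid: summing across the account dimension for one month gives the total
# money held in February.
total_feb = sum(v for (acct, month), v in balances.items() if month == "Feb")

# Invalid: summing the same account's balance across months double-counts
# money that never moved - the "total" 220 is not a meaningful amount.
nonsense = sum(v for (acct, month), v in balances.items() if acct == "A1")

# Across time, semi-additive measures are instead averaged or the last
# snapshot is taken: account A1's average balance is 110.
avg_a1 = sum(v for (acct, month), v in balances.items() if acct == "A1") / 2
```

By contrast, a truly additive measure like sales dollars could be summed across accounts, months, or both, and every total would be meaningful.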

Steps in designing Fact Table

• Identify a business process for analysis (like sales).
• Identify measures or facts (sales dollar).
• Identify the dimensions for the facts (product dimension, location dimension, time dimension, organization dimension).
• List the columns that describe each dimension (region name, branch name, employee name).
• Determine the lowest level (grain) of summary in the fact table: sales dollar for a product in a year within a location, sold or serviced by an employee.
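Putting the steps together, the fact table holds one row per combination of dimension keys at the chosen grain, with sales dollar as the measure. A minimal sketch; the dimension ids echo the sample tables above, and the dollar amounts are invented:

```python
# Fact rows: the composite primary key is the full set of foreign keys
# (product_dim_id, location_dim_id, time_dim_id, organization_dim_id),
# and the value is the additive measure, sales dollar.
sales_fact = {
    (100001, 1, 1, 1): 2500.00,
    (100001, 2, 1, 1): 1800.00,
    (100005, 2, 3, 2): 3200.00,
}

def sales_by_location(fact):
    """Slice the additive measure across one dimension (location)."""
    totals = {}
    for (product_id, location_id, time_id, org_id), dollars in fact.items():
        totals[location_id] = totals.get(location_id, 0.0) + dollars
    return totals
```

This is the "slicing" the dimension sections refer to: because sales dollar is additive, grouping the fact rows by any dimension key (here, location) yields a valid total.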

Monday, 2 February 2015

Windows Script to Schedule a Task

SCHTASKS /parameter [arguments]

Description:

Enables an administrator to create, delete, query, change, run and
end scheduled tasks on a local or remote system.

Parameter List:

/Create     Creates a new scheduled task.

/Delete     Deletes the scheduled task(s).

/Query     Displays all scheduled tasks.

/Change     Changes the properties of scheduled task.

/Run         Runs the scheduled task on demand.

/End         Stops the currently running scheduled task.

/ShowSid Shows the security identifier corresponding to a scheduled task name.

/?             Displays this help message.

SCHTASKS /Create:

SCHTASKS /Create [/S system [/U username [/P [password]]]] [/RU username [/RP password]] /SC schedule [/MO modifier] [/D day] [/M months] [/I idletime] /TN taskname /TR taskrun [/ST starttime] [/RI interval] [ {/ET endtime | /DU duration} [/K] [/XML xmlfile] [/V1]] [/SD startdate] [/ED enddate] [/IT | /NP] [/Z] [/F]

Description:
Enables an administrator to create scheduled tasks on a local or remote system.

Parameter List:
/S     system     Specifies the remote system to connect to. If omitted the system parameter defaults to the local system.

/U     username Specifies the user context under which SchTasks.exe should execute.

/P [password]     Specifies the password for the given user context. Prompts for input if omitted.

/RU     username     Specifies the "run as" user account (user context) under which the task runs. For the system account, valid values are "", "NT AUTHORITY\SYSTEM" or "SYSTEM". For v2 tasks, "NT AUTHORITY\LOCALSERVICE" and "NT AUTHORITY\NETWORKSERVICE" are also available as well as the well known SIDs for all three.

/RP     [password]     Specifies the password for the "run as" user. To prompt for the password, the value must be either "*" or none. This password is ignored for the system account. Must be combined with either /RU or /XML switch.

/SC     schedule     Specifies the schedule frequency. Valid schedule types: MINUTE, HOURLY, DAILY, WEEKLY, MONTHLY, ONCE, ONSTART, ONLOGON, ONIDLE, ONEVENT.

/MO     modifier     Refines the schedule type to allow finer control over schedule recurrence. Valid values are listed in the "Modifiers" section below.

/D     days         Specifies the day of the week to run the task. Valid values: MON, TUE, WED, THU, FRI, SAT, SUN and for MONTHLY schedules 1 - 31 (days of the month). Wildcard "*" specifies all days.

/M     months         Specifies month(s) of the year. Defaults to the first day of the month. Valid values: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC. Wildcard "*" specifies all months.

/I     idletime         Specifies the amount of idle time to wait before running a scheduled ONIDLE task.
Valid range: 1 - 999 minutes.

/TN     taskname         Specifies a name which uniquely identifies this scheduled task.

/TR         taskrun         Specifies the path and file name of the program to be run at the scheduled time. Example: C:\windows\system32\calc.exe

/ST     starttime             Specifies the start time to run the task. The time format is HH:mm (24 hour time) for example, 14:30 for 2:30 PM. Defaults to current time if /ST is not specified. This option is required with /SC ONCE.

/RI     interval             Specifies the repetition interval in minutes. This is not applicable for schedule types: MINUTE, HOURLY, ONSTART, ONLOGON, ONIDLE, ONEVENT.
Valid range: 1 - 599940 minutes. If either /ET or /DU is specified, then it defaults to 10 minutes.

/ET     endtime         Specifies the end time to run the task. The time format is HH:mm (24 hour time) for example, 14:50 for 2:50 PM. This is not applicable for schedule types: ONSTART, ONLOGON, ONIDLE, ONEVENT.

/DU     duration         Specifies the duration to run the task. The time format is HH:mm. This is not applicable with /ET and for schedule types: ONSTART, ONLOGON, ONIDLE, ONEVENT. For /V1 tasks, if /RI is specified, duration defaults to 1 hour.

/K                             Terminates the task at the endtime or duration time. This is not applicable for schedule types: ONSTART, ONLOGON, ONIDLE, ONEVENT. Either /ET or /DU must be specified.

/SD     startdate         Specifies the first date on which the task runs. The format is dd/mm/yyyy. Defaults to the current date. This is not applicable for schedule types: ONCE, ONSTART, ONLOGON, ONIDLE, ONEVENT.

/ED     enddate         Specifies the last date when the task should run. The format is dd/mm/yyyy. This is not applicable for schedule types: ONCE, ONSTART, ONLOGON, ONIDLE, ONEVENT.

/EC     ChannelName         Specifies the event channel for OnEvent triggers.

/IT                        Enables the task to run interactively only if the /RU user is currently logged on at the time the job runs. This task runs only if the user is logged in.

/NP                      No password is stored. The task runs non-interactively as the given user. Only local resources are available.

/Z                         Marks the task for deletion after its final run.

/XML     xmlfile     Creates a task from the task XML specified in a file. Can be combined with /RU and /RP switches, or with /RP alone, when task XML already contains the principal.

/V1                       Creates a task visible to pre-Vista platforms. Not compatible with /XML.

/F                         Forcefully creates the task and suppresses warnings if the specified task already exists.

/RL     level           Sets the Run Level for the job. Valid values are LIMITED and HIGHEST. The default is LIMITED.

/DELAY     delaytime     Specifies the wait time to delay the running of the task after the trigger is fired. The time format is mmmm:ss. This option is only valid for schedule types ONSTART, ONLOGON, ONEVENT.

/?                         Displays this help message.
Modifiers: Valid values for the /MO switch per schedule type:
MINUTE: 1 - 1439 minutes.
HOURLY: 1 - 23 hours.
DAILY: 1 - 365 days.
WEEKLY: weeks 1 - 52.
ONCE: No modifiers.
ONSTART: No modifiers.
ONLOGON: No modifiers.
ONIDLE: No modifiers.
MONTHLY: 1 - 12, or
FIRST, SECOND, THIRD, FOURTH, LAST, LASTDAY.

ONEVENT: XPath event query string.

Examples:
  • Creates a scheduled task "doc" on the remote machine "ABC" which runs notepad.exe every hour under user "runasuser".

    SCHTASKS /Create /S ABC /U user /P password /RU runasuser /RP runaspassword /SC HOURLY /TN doc /TR notepad
     
  • Creates a scheduled task "accountant" on the remote machine "ABC" to run calc.exe every five minutes from the specified start time to end time between the start date and end date.

    SCHTASKS /Create /S ABC /U domain\user /P password /SC MINUTE /MO 5 /TN accountant /TR calc.exe /ST 12:00 /ET 14:00 /SD 06/06/2006 /ED 06/06/2006 /RU runasuser /RP userpassword
     
  • Creates a scheduled task "gametime" to run freecell on the first Sunday of every month.

    SCHTASKS /Create /SC MONTHLY /MO first /D SUN /TN gametime /TR c:\windows\system32\freecell
     
  • Creates a scheduled task "report" on remote machine "ABC" to run notepad.exe every week.

    SCHTASKS /Create /S ABC /U user /P password /RU runasuser /RP runaspassword /SC WEEKLY /TN report /TR notepad.exe
     
  • Creates a scheduled task "logtracker" on remote machine "ABC" to run notepad.exe every five minutes starting from the specified start time with no end time. The /RP password will be prompted for.

    SCHTASKS /Create /S ABC /U domain\user /P password /SC MINUTE /MO 5 /TN logtracker /TR c:\windows\system32\notepad.exe /ST 18:30 /RU runasuser /RP
     
  • Creates a scheduled task "gaming" to run freecell.exe starting at 12:00 and automatically terminating at 14:00 hours every day

    SCHTASKS /Create /SC DAILY /TN gaming /TR c:\freecell /ST 12:00 /ET 14:00 /K
     
  • Creates a scheduled task "EventLog" to run wevtvwr.msc starting whenever event 101 is published in the System channel

    SCHTASKS /Create /TN EventLog /TR wevtvwr.msc /SC ONEVENT /EC System /MO *[System/EventID=101]
     
  • Spaces in file paths can be used by using two sets of quotes, one set for CMD.EXE and one for SchTasks.exe. The outer quotes for CMD need to be double quotes; the inner quotes can be single quotes or escaped double quotes:

    SCHTASKS /Create /tr "'c:\program files\internet explorer\iexplorer.exe' \"c:\log data\today.xml\"" ...

SCHTASKS /Delete:

SCHTASKS /Delete [/S system [/U username [/P [password]]]]
/TN taskname [/F]

Description:
Deletes one or more scheduled tasks.

Parameter List:
/S     system     Specifies the remote system to connect to. If omitted the system parameter defaults to the local system.

/U     username Specifies the user context under which SchTasks.exe should execute.

/P [password]     Specifies the password for the given user context. Prompts for input if omitted.

/TN     taskname     Specifies the name of the scheduled task to delete. Wildcard "*" may be used to delete all tasks.

/F                         Forcefully deletes the task and suppresses warnings if the specified task is currently running.

/?                         Displays this help message.

Examples
SCHTASKS /Delete /TN * /F
SCHTASKS /Delete /TN "Backup and Restore"
SCHTASKS /Delete /S system /U user /P password /TN "Start Restore"
SCHTASKS /Delete /S system /U user /P password /TN "Start Backup" /F

SCHTASKS /Query:

SCHTASKS /Query [/S system [/U username [/P [password]]]]
[/FO format | /XML [xml_type]] [/NH] [/V] [/TN taskname] [/?]

Description:
Enables an administrator to display the scheduled tasks on the local or remote system.

Parameter List:
/S     system     Specifies the remote system to connect to. If omitted the system parameter defaults to the local system.

/U     username Specifies the user context under which SchTasks.exe should execute.

/P [password]     Specifies the password for the given user context. Prompts for input if omitted.

/FO     format  Specifies the format for the output. Valid values: TABLE, LIST, CSV.

/NH                 Specifies that the column header should not be displayed in the output. This is valid only for TABLE format.

/V                     Displays verbose task output.

/TN     taskname        Specifies the task name for which to retrieve the information, else all of them.

/XML [xml_type]       Displays task definitions in XML format.

If xml_type is ONE then the output will be one valid XML file.

If xml_type is not present then the output will be the concatenation of all XML task definitions.

/?                     Displays this help message.

Examples
SCHTASKS /Query
SCHTASKS /Query /?
SCHTASKS /Query /S system /U user /P password
SCHTASKS /Query /FO LIST /V /S system /U user /P password
SCHTASKS /Query /FO TABLE /NH /V

SCHTASKS /Change:

SCHTASKS /Change [/S system [/U username [/P [password]]]] /TN taskname { [/RU runasuser] [/RP runaspassword] [/TR taskrun] [/ST starttime] [/RI interval] [ {/ET endtime | /DU duration} [/K] ] [/SD startdate] [/ED enddate] [/ENABLE | /DISABLE] [/IT] [/Z] }

Description:
Changes the program to run, or user account and password used by a scheduled task.

Parameter List:
/S     system     Specifies the remote system to connect to. If omitted the system parameter defaults to the local system.

/U     username Specifies the user context under which SchTasks.exe should execute.

/P [password]     Specifies the password for the given user context. Prompts for input if omitted.
/TN     taskname     Specifies which scheduled task to change.

/RU     username     Specifies the "run as" user account (user context) under which the task runs. For the system account, valid values are "", "NT AUTHORITY\SYSTEM" or "SYSTEM". For v2 tasks, "NT AUTHORITY\LOCALSERVICE" and "NT AUTHORITY\NETWORKSERVICE" are also available as well as the well known SIDs for all three.

/RP     [password]     Specifies the password for the "run as" user. To prompt for the password, the value must be either "*" or none. This password is ignored for the system account. Must be combined with either /RU or /XML switch.
/ST     starttime         Specifies the start time to run the task. The time format is HH:mm (24 hour time) for example, 14:30 for 2:30 PM.

/RI     interval            Specifies the repetition interval in minutes. Valid range: 1 - 599940 minutes.

/ET     endtime           Specifies the end time to run the task. The time format is HH:mm (24 hour time) for example, 14:50 for 2:50 PM.

/DU    duration          Specifies the duration to run the task. The time format is HH:mm. This is not applicable with /ET.

/K                             Terminates the task at the endtime or duration time.

/SD     startdate         Specifies the first date on which the task runs. The format is dd/mm/yyyy.

/ED     enddate          Specifies the last date when the task should run. The format is dd/mm/yyyy.

/IT                            Enables the task to run interactively only if the /RU user is currently logged on at the time the job runs. This task runs only if the user is logged in.

/RL     level               Sets the Run Level for the job. Valid values are LIMITED and HIGHEST. The default is to not change it.
/ENABLE                Enables the scheduled task.

/DISABLE               Disables the scheduled task.

/Z                             Marks the task for deletion after its final run.

/DELAY     delaytime     Specifies the wait time to delay the running of the task after the trigger is fired. The time format is mmmm:ss. This option is only valid for schedule types ONSTART, ONLOGON, ONEVENT.

/?                            Displays this help message.

Examples:
SCHTASKS /Change /RP password /TN "Backup and Restore"
SCHTASKS /Change /TR restore.exe /TN "Start Restore"
SCHTASKS /Change /S system /U user /P password /RU newuser /TN "Start Backup" /IT

SCHTASKS /Run:

SCHTASKS /Run [/S system [/U username [/P [password]]]] [/I] /TN taskname

Description:
Runs a scheduled task on demand.

Parameter List:
/S     system     Specifies the remote system to connect to. If omitted the system parameter defaults to the local system.

/U     username Specifies the user context under which SchTasks.exe should execute.

/P [password]     Specifies the password for the given user context. Prompts for input if omitted.
/I                         Runs the task immediately by ignoring any constraint.

/TN     taskname     Identifies the scheduled task to run now.

/?                         Displays this help message.

Examples:
SCHTASKS /Run /?
SCHTASKS /Run /TN "Start Backup"
SCHTASKS /Run /S system /U user /P password /I /TN "Backup and Restore"

SCHTASKS /End
SCHTASKS /End [/S system [/U username [/P [password]]]] /TN taskname

Description:
Stops a running scheduled task.

Parameter List:
/S     system     Specifies the remote system to connect to. If omitted the system parameter defaults to the local system.

/U     username Specifies the user context under which SchTasks.exe should execute.

/P [password]     Specifies the password for the given user context. Prompts for input if omitted.
/TN     taskname     Identifies the scheduled task to stop.

/?                         Displays this help message.
Examples:
SCHTASKS /End /?
SCHTASKS /End /TN "Start Backup"
SCHTASKS /End /S system /U user /P password /TN "Backup and Restore"



Ref :


http://dosprompt.info/commands/schtasks.asp
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/schtasks.mspx?mfr=true


Example :


SchTasks /Create /SC DAILY /TN "MyName" /TR "C:\ABC\ABC.bat" /ST 11:00 /RI 60 /DU 04:00 /RL LIMITED /K


