DATA ANALYTICS
CITATIONS IN THE EMBEDS BELOW
Today, organizations that use data to uncover opportunities and apply that knowledge to differentiate themselves are the ones leading into the future. Whether it is looking for patterns in financial transactions to detect fraud, using recommendation engines to drive conversion, mining social media posts for customer voice, or personalizing offers based on customer behavior analysis, business leaders realize that data holds the key to competitive advantage. To get value from data, you need a wide range of skill sets and people playing different roles.
In this video, we're going to look at the role data engineers, data analysts, data scientists, business analysts, and business intelligence or BI analysts play in helping organizations tap into vast amounts of data and turn them into actionable insights. It all starts with a data engineer. Data engineers are people who develop and maintain data architectures and make data available for business operations and analysis.
Data engineers work within the data ecosystem to extract, integrate, and organize data from disparate sources; clean, transform, and prepare data; and design, store, and manage data in data repositories. They enable data to be accessible in formats and systems that the various business applications, as well as stakeholders like data analysts and data scientists, can utilize. A data engineer must have good knowledge of programming, sound knowledge of systems and technology architectures, and an in-depth understanding of relational databases and non-relational data stores. Now let's look at the role of a data analyst.
In short, a data analyst translates data and numbers into plain language so organizations can make decisions. Data analysts inspect and clean data for deriving insights; identify correlations, find patterns, and apply statistical methods to analyze and mine data; and visualize data to interpret and present the findings of data analysis. Analysts are the people who answer questions such as, "Are the users' search experiences generally good or bad with the search functionality on our site?" or "What is the popular perception of people regarding our rebranding initiatives?" or "Is there a correlation between sales of one product and another?" Data analysts require good knowledge of spreadsheets, writing queries, and using statistical tools to create charts and dashboards. Modern data analysts also need to have some programming skills.
They also need strong analytical and storytelling skills. And now let's look at the role data scientists play in this ecosystem. Data scientists analyze data for actionable insights and build machine learning or deep learning models that train on past data to create predictive models. Data scientists are people who answer questions such as, How many new social media followers am I likely to get next month, or what percentage of my customers am I likely to lose to competition in the next quarter, or is this financial transaction unusual for this customer? Data scientists require knowledge of mathematics, statistics, and a fair understanding of programming languages, databases, and building data models. They also need to have domain knowledge.
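To make the idea of a model that trains on past data more concrete, here is a minimal sketch of the "how many new followers next month" question. It fits a straight line to past monthly counts using ordinary least squares. The numbers and the `fit_line` helper are hypothetical, and a real predictive model would usually be more sophisticated and would report uncertainty alongside the estimate.

```python
# Hypothetical history: new followers gained in each of the past six months.
history = [110, 125, 138, 150, 166, 179]

def fit_line(ys):
    """Ordinary least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(history)

# Extrapolate one step past the last observed month.
forecast = slope * len(history) + intercept
print(f"Estimated new followers next month: {forecast:.0f}")
```

The forecast is only an estimate of what might happen, based on the assumption that the past trend continues.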
Then we also have business analysts and BI analysts. Business analysts leverage the work of data analysts and data scientists to look at possible implications for their business and the actions they need to take or recommend. BI analysts do the same, except their focus is on the market forces and external influences that shape their business.
They provide business intelligence solutions by organizing and monitoring data on different business functions and exploring that data to extract insights and actionables that improve business performance. To summarize, in simple terms, data engineering converts raw data into usable data. Data analytics uses this data to generate insights. Data scientists use data analytics and data engineering to predict the future using data from the past. Business analysts and business intelligence analysts use these insights and predictions to drive decisions that benefit and grow their business. Interestingly, it's not uncommon for data professionals to start their career in one of the data roles and transition to another role within the data ecosystem by supplementing their skills. - IBM DATA ANALYTICS COURSE / COURSERA 2021
A data analyst ecosystem includes the infrastructure, software, tools, frameworks, and processes used to gather, clean, analyze, mine, and visualize data.
Based on how well-defined the structure of the data is, data can be categorized as:
Structured Data, that is data which is well organized in formats that can be stored in databases.
Semi-Structured Data, that is data which is partially organized and partially free form.
Unstructured Data, that is data which cannot be organized conventionally into rows and columns.
Data comes in a wide-ranging variety of file formats, such as delimited text files, spreadsheets, XML, PDF, and JSON, each with its own list of benefits and limitations of use.
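As a minimal sketch of working with one of these formats, the following reads a small delimited text file with Python's standard `csv` module. The sample data, the column names, and the `read_delimited` helper are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical sample: comma-delimited text with a header row.
SAMPLE_CSV = """region,product,units_sold
North,X,120
South,X,95
"""

def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]

rows = read_delimited(SAMPLE_CSV)
for row in rows:
    # Every value arrives as a string; convert before doing arithmetic.
    print(row["region"], int(row["units_sold"]))
```

One limitation this illustrates: a delimited text file carries no type information, so numeric fields have to be converted explicitly.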
Data is extracted from multiple data sources, ranging from relational and non-relational databases to APIs, web services, data streams, social platforms, and sensor devices.
Once the data is identified and gathered from different sources, it needs to be staged in a data repository so that it can be prepared for analysis. The type, format, and sources of data influence the type of data repository that can be used.
Data professionals need a host of languages that can help them extract, prepare, and analyze data. These can be classified as:
Querying languages, such as SQL, used for accessing and manipulating data from databases.
Programming languages such as Python, R, and Java, for developing applications and controlling application behavior.
Shell and Scripting languages, such as Unix/Linux Shell, and PowerShell, for automating repetitive operational tasks.
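To illustrate how a querying language and a programming language work together, here is a small sketch that runs a SQL query from Python using the standard `sqlite3` module. The `sales` table, its columns, and its rows are hypothetical; an in-memory database stands in for a real data repository.

```python
import sqlite3

# In-memory database with a hypothetical "sales" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "X", 120), ("South", "X", 95), ("North", "Y", 40)],
)

# SQL does the data access and aggregation: total units per region, largest first.
totals = conn.execute(
    "SELECT region, SUM(units) AS total_units "
    "FROM sales GROUP BY region ORDER BY total_units DESC"
).fetchall()
conn.close()

# Python controls what happens with the result.
for region, total_units in totals:
    print(region, total_units)
```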
The role of a Data Analyst spans across:
Acquiring data that best serves the use case.
Preparing and analyzing data to understand what it represents.
Interpreting and effectively communicating the message to stakeholders who need to act on the findings.
Ensuring that the process is documented for future reference and repeatability.
In order to play this role successfully, Data Analysts need a mix of technical, functional, and soft skills.
Technical Skills include varying levels of proficiency in using spreadsheets, statistical tools, visualization tools, programming and querying languages, and the ability to work with different types of data repositories and big data platforms.
Functional Skills include an understanding of statistics, analytical techniques, problem-solving, the ability to probe a situation from multiple perspectives, data visualization, and project management skills.
Soft Skills include the ability to work collaboratively, communicate effectively, tell a compelling story with data, and garner support and buy-in from stakeholders. Curiosity to explore different pathways and intuition that helps to give a sense of the future based on past experiences are also essential skills for being a good Data Analyst.
Data analysis is the process of gathering, cleaning, analyzing and mining data, interpreting results, and reporting the findings. With data analysis, we find patterns within data and correlations between different data points. It is through these patterns and correlations that insights are generated and conclusions are drawn. Data analysis helps businesses understand their past performance and informs their decision-making for future actions. Using data analysis, businesses can validate a course of action before committing to it, saving valuable time and resources and also ensuring greater success. We will explore four primary types of data analysis, each with a different goal and place in the data analysis process.
Descriptive Analytics helps answer questions about what happened over a given period of time by summarizing past data and presenting the findings to stakeholders. It helps provide essential insights into past events. Examples include tracking past performance based on the organization's key performance indicators, or cash flow analysis.
Diagnostic Analytics helps answer the question, "Why did it happen?" It takes the insights from descriptive analytics and digs deeper to find the cause of the outcome. Examples include investigating a sudden change in traffic to a website without an obvious cause, or an increase in sales in a region where there has been no change in marketing.
Predictive Analytics helps answer the question, "What will happen next?" Historical data and trends are used to predict future outcomes. Some of the areas in which businesses apply predictive analysis are risk assessment and sales forecasts. It's important to note that the purpose of predictive analytics is not to say what will happen in the future; its objective is to forecast what might happen in the future. All predictions are probabilistic in nature.
Prescriptive Analytics helps answer the question, "What should be done about it?"
By analyzing past decisions and events, the likelihood of different outcomes is estimated, on the basis of which a course of action is decided. Self-driving cars are a good example of Prescriptive Analytics. They analyze the environment to make decisions regarding speed, changing lanes, which route to take, etc. Another example is airlines automatically adjusting ticket prices based on customer demand, gas prices, the weather, or traffic on connecting routes.
Now let's look at some of the key steps in any data analysis process.
Understanding the problem and desired result. Data analysis begins with understanding the problem that needs to be solved and the desired outcome that needs to be achieved. Where you are and where you want to be needs to be clearly defined before the analysis process can begin.
Setting a clear metric. This stage of the process includes deciding what will be measured, for example, the number of product X sold in a region, and how it will be measured, for example, in a quarter or during a festival season.
Gathering data. Once you know what you're going to measure and how you're going to measure it, you identify the data you require, the data sources you need to pull this data from, and the best tools for the job.
Cleaning data. Having gathered the data, the next step is to fix quality issues in the data that could affect the accuracy of the analysis. This is a critical step because the accuracy of the analysis can only be ensured if the data is clean. You will clean the data for missing or incomplete values and outliers. For example, in customer demographics data, an age field with a value of 150 is an outlier. You will also standardize the data coming in from multiple sources.
Analyzing and mining data. Once the data is clean, you will extract and analyze the data from different perspectives. You may need to manipulate your data in several different ways to understand the trends, identify correlations, and find patterns and variations.
Interpreting results. After analyzing your data and possibly conducting further research, which can be an iterative loop, it's time to interpret your results. As you interpret your results, you need to evaluate if your analysis is defensible against objections, and if there are any limitations or circumstances under which your analysis may not hold true.
Presenting your findings. Ultimately, the goal of any analysis is to impact decision making. The ability to communicate and present your findings in clear and impactful ways is as important a part of the data analysis process as is the analysis itself. Reports, dashboards, charts, graphs, maps, and case studies are just some of the ways in which you can present your data.
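The cleaning step described above (missing values, outliers such as an age of 150, and standardizing data from multiple sources) can be sketched roughly as follows. The records, the 0 to 120 plausibility range for age, the country aliases, and the `clean` helper are all assumptions made for illustration. What to do with a flagged value (drop it, correct it, or investigate it) depends on the analysis; this sketch simply blanks it out and reports it.

```python
# Hypothetical customer records gathered from two sources.
records = [
    {"name": "Ana", "age": "34", "country": "us"},
    {"name": "Ben", "age": "", "country": "USA"},      # missing age
    {"name": "Cy", "age": "150", "country": "U.S."},   # implausible age
    {"name": "Dee", "age": "41", "country": "Canada"},
]

# Assumed rules: plausible ages are 0-120; these spellings all mean "US".
MAX_AGE = 120
COUNTRY_ALIASES = {"us": "US", "usa": "US", "u.s.": "US"}

def clean(record):
    """Return (cleaned_record, issues) without modifying the input record."""
    issues = []
    cleaned = dict(record)

    raw_age = record["age"].strip()
    if not raw_age:
        cleaned["age"] = None
        issues.append("missing age")
    else:
        age = int(raw_age)
        if 0 <= age <= MAX_AGE:
            cleaned["age"] = age
        else:
            cleaned["age"] = None
            issues.append(f"outlier age {age}")

    # Standardize country spellings that differ between sources.
    country = record["country"].strip()
    cleaned["country"] = COUNTRY_ALIASES.get(country.lower(), country)
    return cleaned, issues

cleaned_records = []
for rec in records:
    cleaned, issues = clean(rec)
    cleaned_records.append(cleaned)
    if issues:
        print(rec["name"], "->", ", ".join(issues))
```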
While the role of a Data Analyst varies depending on the type of organization and the extent to which it has adopted data-driven practices, there are some responsibilities that are typical to a Data Analyst role in today’s organizations. These include:
Acquiring data from primary and secondary data sources.
Creating queries to extract required data from databases and other data collection systems.
Filtering, cleaning, standardizing, and reorganizing data in preparation for data analysis.
Using statistical tools to interpret data sets.
Using statistical techniques to identify patterns and correlations in data.
Analyzing patterns in complex data sets and interpreting trends.
Preparing reports and charts that effectively communicate trends and patterns.
Creating appropriate documentation to define and demonstrate the steps of the data analysis process.
Corresponding to these responsibilities, let’s look at some of the skills that are valuable for a Data Analyst. The data analysis process requires a combination of technical, functional, and soft skills. Let’s first look at some of the technical skills that you need in your role as a Data Analyst. These include:
Expertise in using spreadsheets such as Microsoft Excel or Google Sheets.
Proficiency in statistical analysis and visualization tools and software such as IBM Cognos, IBM SPSS, Oracle Visual Analyzer, Microsoft Power BI, SAS, and Tableau.
Proficiency in at least one programming language such as R or Python, and in some cases C++, Java, or MATLAB.
Good knowledge of SQL, and the ability to work with data in relational and NoSQL databases.
The ability to access and extract data from data repositories such as data marts, data warehouses, data lakes, and data pipelines.
Familiarity with Big Data processing tools such as Hadoop, Hive, and Spark.
We will understand more about the features and use cases of some of these programming languages, databases, data repositories, and big data processing tools further along in the course. Now we’ll look at some of the functional skills that you require for the role of Data Analyst. These include:
Proficiency in Statistics to help you analyze your data, validate your analysis, and identify fallacies and logical errors.
Analytical skills that help you research and interpret data, theorize, and make forecasts.
Problem-solving skills, because ultimately, the end-goal of all data analysis is to solve problems.
Probing skills that are essential for the discovery process, that is, for understanding a problem from the perspective of varied stakeholders and users, because the data analysis process really begins with a clear articulation of the problem statement and desired outcome.
Data Visualization skills that help you decide on the techniques and tools that present your findings effectively based on your audience, type of data, context, and end-goal of your analysis.
Project Management skills to manage the process, people, dependencies, and timelines of the initiative.
That brings us to your soft skills as a Data Analyst. Data Analysis is both a science and an art. You can ace the technical and functional expertise, but one of the key differentiators for your success is going to be soft skills. This includes your ability to work collaboratively with business and cross-functional teams; communicate effectively to report and present your findings; tell a compelling and convincing story; and gather support and buy-in for your work. Above all, being curious is at the heart of data analysis. In the course of your work, you will stumble upon patterns, phenomena, and anomalies that may show you a different path. The ability to allow new questions to surface and challenge your assumptions and hypotheses makes for a great analyst.
You will also hear data analysis practitioners talk about intuition as a must-have quality. It’s essential to note that intuition, in this context, is the ability to have a sense of the future based on pattern recognition and past experiences. In this video, we learned about the responsibilities and skillsets of a Data Analyst. In the next video, we will walk you through a day in the life of a Data Analyst.
Data is unorganized information that is processed to make it meaningful. Generally, data comprises facts, observations, perceptions, numbers, characters, symbols, and images that can be interpreted to derive meaning. One of the ways in which data can be categorized is by its structure. Data can be structured, semi-structured, or unstructured.
Structured data has a well-defined structure or adheres to a specified data model, can be stored in well-defined schemas such as databases, and in many cases can be represented in a tabular manner with rows and columns. Structured data is objective facts and numbers that can be collected, exported, stored, and organized in typical databases. Some of the sources of structured data could include: SQL databases and Online Transaction Processing (or OLTP) systems that focus on business transactions; spreadsheets such as Excel and Google Spreadsheets; online forms; sensors such as Global Positioning Systems (or GPS) and Radio Frequency Identification (or RFID) tags; and network and web server logs. You can typically store structured data in relational or SQL databases. You can also easily examine structured data with standard data analysis methods and tools.
Semi-structured data is data that has some organizational properties but lacks a fixed or rigid schema. Semi-structured data cannot be stored in the form of rows and columns as in databases. It contains tags and elements, or metadata, which is used to group data and organize it in a hierarchy. Some of the sources of semi-structured data could include: e-mails; XML and other markup languages; binary executables; TCP/IP packets; zipped files; and integration of data from different sources. XML and JSON allow users to define tags and attributes to store data in a hierarchical form and are used widely to store and exchange semi-structured data.
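As a rough illustration of the difference, the sketch below takes a small JSON document, which is hierarchical and has an optional field, and flattens it into rows with a fixed set of columns. The document contents, the field names, and the choice to treat a missing `gift` key as `False` are hypothetical. The JSON does not fit rows and columns as it stands, and flattening is one common way to prepare it for tabular analysis.

```python
import json

# Hypothetical semi-structured record: nested fields and a variable-length list.
doc = json.loads("""
{
  "customer": {"id": 7, "name": "Ana"},
  "orders": [
    {"sku": "X", "qty": 2},
    {"sku": "Y", "qty": 1, "gift": true}
  ]
}
""")

# Flatten into one row per order so every row has the same columns.
rows = [
    {
        "customer_id": doc["customer"]["id"],
        "sku": order["sku"],
        "qty": order["qty"],
        "gift": order.get("gift", False),  # optional key: assume False if absent
    }
    for order in doc["orders"]
]

for row in rows:
    print(row)
```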
Unstructured data is data that does not have an easily identifiable structure and, therefore, cannot be organized in a mainstream relational database in the form of rows and columns. It does not follow any particular format, sequence, semantics, or rules. Unstructured data can deal with the heterogeneity of sources and has a variety of business intelligence and analytics applications. Some of the sources of unstructured data could include: web pages; social media feeds; images in varied file formats (such as JPEG, GIF, and PNG); video and audio files; documents and PDF files; PowerPoint presentations; media logs; and surveys. Unstructured data can be stored in files and documents (such as a Word doc) for manual analysis or in NoSQL databases that have their own analysis tools for examining this type of data.
To summarize, structured data is data that is well organized in formats that can be stored in databases and lends itself to standard data analysis methods and tools; semi-structured data is data that is somewhat organized and relies on meta tags for grouping and hierarchy; and unstructured data is data that is not conventionally organized in the form of rows and columns in a particular format. In the next video, we will learn about the different types of file structures.