# Advanced Statistical Data Analysis Techniques for Research Problems

**What is Statistical Analysis?**

Statistical analysis is the process of collecting data and uncovering the trends and patterns in it. Once the data has been collected, it can be studied, or analysed, for:

1. Summarizing the data – making a pie chart or bar graph

2. Finding key measures of location – for example, the mean of a data set gives you the average, while the median gives you the mid-point of the data set

3. Measuring the spread – this shows whether the data is tightly clustered or more spread out. Standard deviation is the most commonly used measure of spread – it tells you how far the data typically lies from the mean.

4. Using past behaviour to make future predictions – Future predictions are very helpful in fields such as retail, banking, manufacturing, etc.

5. Testing a hypothesis – Only once the data has been analysed can it actually answer questions and tell a story. In hypothesis testing, the data is used to decide whether to reject the null hypothesis (the default or commonly accepted position) or to conclude that there is not enough evidence to reject it.
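The hypothesis-testing step in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the coin-flip data, the 0.05 significance level, and the helper name `binomial_two_sided_p` are all assumptions made for the example. It runs an exact two-sided binomial test of the null hypothesis that a coin is fair.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value for a binomial test of H0: P(success) = p_null."""

    def pmf(k: int) -> float:
        # Probability of exactly k successes if the null hypothesis is true.
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Add up every outcome that is no more likely than the observed one under H0.
    # The tiny tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))


# Hypothetical data: 60 heads in 100 flips; H0 says the coin is fair.
ALPHA = 0.05  # assumed significance level
p_value = binomial_two_sided_p(successes=60, trials=100)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```

With these made-up numbers the p-value comes out slightly above 0.05, so the test fails to reject the null hypothesis. That is not the same as proving the coin is fair; it only means the sample does not provide strong enough evidence against fairness.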

Statistics can be defined as the practice or the science of collecting and analysing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample. Statistics also covers all aspects of data collection, such as the design of surveys and experiments. Two methodologies are commonly used in statistics: descriptive statistics and inferential statistics.

**Descriptive statistics:** A descriptive statistic is a number derived from the data, such as the mean (average) or the standard deviation. Choosing a suitable set of relevant descriptive statistics is part of examining the data. Descriptive statistics come in handy when data sets are being compared, because they give an idea of the similarities and differences between them. For example, the reporting of examination marks requires descriptive statistics. Each examinee's score is compared with the scores of all the examinees who took the examination. The scores are then analysed to find the average score (the mean), the exact middle score (the median), and the most frequently occurring score (the mode). The mean, median and mode are all measures of central tendency – basically, they describe the centre of the group and help show where each examinee taking the exam fits into it.
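As a rough sketch of the examination example, the snippet below computes the mean, median, mode and standard deviation with Python's standard `statistics` module. The scores are invented for illustration, and the helper `percent_below` is a hypothetical name introduced here to show one simple way of placing an examinee within the group.

```python
import statistics

# Hypothetical examination scores (made up for illustration).
scores = [55, 62, 62, 70, 71, 78, 85, 91]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequently occurring score
spread = statistics.stdev(scores)         # sample standard deviation


def percent_below(score, all_scores):
    """Percentage of examinees who scored strictly below `score`."""
    return 100 * sum(s < score for s in all_scores) / len(all_scores)


print(f"mean={mean_score}, median={median_score}, mode={mode_score}")
print(f"standard deviation={spread:.2f}")
print(f"a score of 78 is above {percent_below(78, scores)}% of the group")
```

Here the mean (71.75), median (70.5) and mode (62) all differ, which is a reminder that the three measures answer slightly different questions about where the centre of the group lies.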

**Inferential Statistics:** Once the data has been explored using visualization and descriptive techniques, the researcher needs to identify the formal statistical technique required to investigate the data further and draw conclusions. Many statistical techniques have been developed for this purpose, to handle many different types of data and to model the relationships between them. For example, predictions made during elections (especially those in the US) are based on inferential, or inductive, statistics. Here, the data is gathered at different points during the campaign by sampling the voting population to form a subset. Each member of the subset is then asked who they will vote for, and predictions about the whole voting population are drawn from these responses.
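To make the polling example concrete, here is a minimal sketch of estimating a candidate's support from a sample. It assumes a simple random sample and uses the normal-approximation interval for a proportion, which is one common textbook choice rather than the method any particular pollster uses. The poll figures and the function name `proportion_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(supporters, sample_size, confidence=0.95):
    """Point estimate and normal-approximation confidence interval for a proportion."""
    p_hat = supporters / sample_size
    # Critical value of the standard normal distribution, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / sample_size)
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical poll: 520 of 1,000 sampled voters say they will vote for candidate A.
p_hat, low, high = proportion_interval(supporters=520, sample_size=1000)
print(f"estimated support: {p_hat:.1%} (95% interval: {low:.1%} to {high:.1%})")
```

In this invented poll the interval extends below 50%, so the sample alone would not justify predicting a majority for candidate A, even though the point estimate is 52%.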

**Statistical Analysis and Science**

Statistical analysis is used extensively in the study of all sciences, including the social sciences. Statistics can help provide an approximation for an unknown quantity that is difficult or even impossible to measure directly. For example, certain topics in social science, such as the study of consciousness or choice, are almost impossible to measure. Statistical analysis can help here by shedding light on which scenarios are the most likely and which are the least likely.

We, at Precision Analytics India, employ a range of statistical analysis techniques in delivering solutions. Some of these techniques are:

**•** Descriptive statistics and data visualisation

**•** Correlation and regression

**•** Discriminant analysis

**•** Parametric and non-parametric tests

**•** Principal component and factor analysis

**•** Time series analysis

**•** Structural equation modelling