Unit 3: Statistics

Statistics is the science of collecting, analyzing, presenting, and interpreting data. Governmental needs for data, as well as for information about a variety of economic activities, provided much of the early impetus for the field of statistics. Currently, the need to turn the large amounts of data available in many applied fields into useful information has stimulated both theoretical and practical developments in statistics.
             
Data are the facts and figures that are collected, analyzed, and summarized for presentation and interpretation. Data may be classified as either quantitative or qualitative. Quantitative data measure either how much or how many of something, and qualitative data provide labels, or names, for categories of like items. For example, suppose that a particular study is interested in characteristics such as age, gender, marital status, and annual income for a sample of 100 individuals. Age and annual income are quantitative, while gender and marital status are qualitative.
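The distinction between quantitative and qualitative data can be illustrated with a minimal Python sketch. The records below are invented for illustration (they are not from any real study), and the helper name `classify_variable` is hypothetical. The rule it uses, treating a variable as quantitative when every value is a number, is a simplifying assumption.

```python
# Hypothetical records for three of the 100 individuals; all values are invented.
sample = [
    {"age": 34, "gender": "female", "marital_status": "married", "annual_income": 52000.0},
    {"age": 51, "gender": "male", "marital_status": "single", "annual_income": 61000.0},
    {"age": 27, "gender": "female", "marital_status": "single", "annual_income": 38500.0},
]


def classify_variable(values):
    """Return 'quantitative' if every value is a number, otherwise 'qualitative'."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if is_numeric else "qualitative"


# Classify each variable (column) by looking at its values across all records.
variable_types = {
    name: classify_variable([row[name] for row in sample]) for name in sample[0]
}

for name, kind in variable_types.items():
    print(f"{name}: {kind}")
```

Note that this rule would misclassify categories stored as numeric codes (for example, 1 for married and 2 for single), so in practice the classification depends on what a value means, not only on how it is stored.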

Primary data and secondary data in statistics
Primary data
Primary data is data that is collected for the first time through personal experiences or evidence, particularly for research. It is also described as raw data or first-hand information.
An advantage of using primary data is that researchers are collecting information for the specific purposes of their study. In essence, the questions the researchers ask are tailored to elicit the data that will help them with their study. Researchers collect the data themselves, using surveys, interviews and direct observations.

In the field of workplace health research, for example, direct observations may involve a researcher watching people at work. The researcher could count and code the number of times she sees practices or behaviours relevant to her interest; e.g. instances of improper lifting posture or the number of hostile or disrespectful interactions workers engage in with clients and customers over a period of time.
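Counting coded observations of this kind can be sketched in a few lines of Python. The code labels and the observation log below are hypothetical, chosen only to mirror the example above. A real study would define its own coding scheme.

```python
from collections import Counter

# Hypothetical log: one code per behaviour the researcher observed, in order.
observations = [
    "improper_lift",
    "hostile_interaction",
    "improper_lift",
    "proper_lift",
    "improper_lift",
]

# Counter tallies how many times each code appears in the log.
tally = Counter(observations)

# List the codes from most to least frequent.
for code, count in tally.most_common():
    print(f"{code}: {count}")
```

Keeping the raw log and deriving the counts from it means the researcher can later recount under a revised coding scheme without repeating the observation.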

To take another example, let’s say a research team wants to find out about workers’ experiences of returning to work after a work-related injury. Part of the research may involve interviewing workers by telephone about how long they were off work and about their experiences with the return-to-work process. The workers’ answers, which are considered primary data, will provide the researchers with specific information about the return-to-work process; e.g. they may learn about the frequency of work accommodation offers, and the reasons some workers refused such offers.

Secondary data
Secondary data refers to data that is collected by someone other than the primary user. Common sources of secondary data for social science include censuses, information collected by government departments, organizational records and data that was originally collected for other research purposes.
There are several types of secondary data. They can include information from the national population census and other government information collected by Statistics Canada. One type of secondary data that’s used increasingly is administrative data. This term refers to data that is collected routinely as part of the day-to-day operations of an organization, institution or agency. There are any number of examples: motor vehicle registrations, hospital intake and discharge records, workers’ compensation claims records, and more.

Compared to primary data, secondary data tends to be readily available and inexpensive to obtain. In addition, administrative data tends to have large samples, because the data collection is comprehensive and routine. What’s more, administrative data (like many types of secondary data) is often collected over a long period, which allows researchers to detect change over time.

Going back to the return-to-work study mentioned above, the researchers could also examine secondary data in addition to the information provided by their primary data (i.e. survey results). They could look at workers’ compensation lost-time claims data to determine the amount of time workers were receiving wage replacement benefits. With a combination of these two data sources, the researchers may be able to determine which factors predict a shorter work absence among injured workers. This information could then help improve return to work for other injured workers.
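Combining the two data sources amounts to linking records that describe the same worker. The sketch below assumes, purely for illustration, that both sources share a worker identifier. The identifiers, field names, values, and the `combine` helper are all hypothetical.

```python
# Primary data: hypothetical telephone-interview answers, keyed by worker ID.
interviews = {
    "W1": {"accommodation_offered": True},
    "W2": {"accommodation_offered": False},
    "W3": {"accommodation_offered": True},
}

# Secondary data: hypothetical lost-time claims records,
# giving days of wage replacement benefits per worker ID.
claims = {"W1": 12, "W2": 45, "W4": 30}


def combine(interviews, claims):
    """Link the two sources, keeping only workers who appear in both."""
    combined = []
    for worker_id in sorted(interviews.keys() & claims.keys()):
        combined.append(
            {
                "worker_id": worker_id,
                **interviews[worker_id],
                "benefit_days": claims[worker_id],
            }
        )
    return combined


linked = combine(interviews, claims)
for record in linked:
    print(record)
```

Only workers present in both sources are kept here; W3 (interviewed, but with no claim record) and W4 (claim record, but not interviewed) are dropped. Whether to keep such unmatched records is a design choice that depends on the research question.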

The type of data researchers choose can depend on many things, including the research question, their budget, their skills and available resources. Based on these and other factors, they may choose to use primary data, secondary data, or both.
