Manufacturing Defects Synthetic Data
In this notebook we generate synthetic data representing defect-rate measurements in a manufacturing setting. Each row pairs two factors, Workday and Time per Task, with a resulting Defect Rate.
import numpy as np
import pandas as pd
# Generate synthetic data
Factors = []
Outcome = []
numpoints = 2000
# Draw both factors from normal distributions and pair them up point by point
for workday, time_per_task in zip(np.random.normal(loc=.3, scale=.05, size=numpoints),
                                  np.random.normal(loc=.05, scale=.01, size=numpoints)):
    Factors.append([workday, time_per_task])
    # The first term is multiplied by 0, so it does not contribute to the outcome
    Outcome.append(0*workday**2/(time_per_task**2) + 1/time_per_task**1.5 + 1000*workday**1.5)
data = pd.DataFrame(Factors, columns=['Workday', 'Time per Task'])
data['Defect Rate'] = Outcome
# Rescale so the largest defect rate is 0.1
data['Defect Rate'] /= data['Defect Rate'].max()*10
# Add Gaussian noise to the rescaled defect rate
data['Defect Rate'] += np.random.normal(scale=.003, size=len(data['Defect Rate']))
data.head()
|   | Workday | Time per Task | Defect Rate |
|---|---|---|---|
| 0 | 0.357563 | 0.036497 | 0.066678 |
| 1 | 0.300276 | 0.035329 | 0.063891 |
| 2 | 0.301040 | 0.054992 | 0.049828 |
| 3 | 0.290333 | 0.046289 | 0.046932 |
| 4 | 0.384306 | 0.050605 | 0.064480 |
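The generation cell above draws from NumPy's global random state without a seed, so rerunning it will not reproduce the rows shown. The following is a minimal sketch of a seeded, vectorized variant. The function name `make_defect_data` and the `seed` parameter are assumptions introduced here for illustration and are not part of the original notebook. It uses `np.random.default_rng`, so its values will differ from the table above even though the distributions and formula are the same.

```python
import numpy as np
import pandas as pd

def make_defect_data(numpoints=2000, seed=0):
    """Hypothetical helper: seeded, vectorized version of the generation loop."""
    rng = np.random.default_rng(seed)
    # Same distributions as the loop-based cell
    workday = rng.normal(loc=0.3, scale=0.05, size=numpoints)
    time_per_task = rng.normal(loc=0.05, scale=0.01, size=numpoints)
    # Only the nonzero terms are kept; the 0*... term in the loop contributes nothing
    outcome = 1 / time_per_task**1.5 + 1000 * workday**1.5
    data = pd.DataFrame({'Workday': workday, 'Time per Task': time_per_task})
    # Rescale so the largest value is 0.1, then add Gaussian noise
    data['Defect Rate'] = outcome / (outcome.max() * 10)
    data['Defect Rate'] += rng.normal(scale=0.003, size=numpoints)
    return data

data_vec = make_defect_data()
print(data_vec.head())
```

Calling `make_defect_data` twice with the same `seed` should return identical frames, which makes downstream results easier to compare across runs.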
data.to_csv('Manufacturing_Defects_Synthetic_Data.csv')
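Because `to_csv` is called without `index=False`, the row index is written as an unnamed first column. The sketch below shows one way to read such a file back. It uses a small stand-in frame (two rows copied from the `head()` output) and a temporary directory so it does not touch the real file; both are assumptions made for illustration.

```python
import os
import tempfile
import pandas as pd

# Stand-in frame with the same columns as `data`
sample = pd.DataFrame({
    'Workday': [0.357563, 0.300276],
    'Time per Task': [0.036497, 0.035329],
    'Defect Rate': [0.066678, 0.063891],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'Manufacturing_Defects_Synthetic_Data.csv')
    sample.to_csv(path)                        # index is written as an unnamed first column
    raw = pd.read_csv(path)                    # index comes back as a data column
    restored = pd.read_csv(path, index_col=0)  # first column is used as the index again

print(list(raw.columns))
print(list(restored.columns))
```

Passing `index_col=0` when loading keeps the three data columns separate from the saved index. Alternatively, writing with `index=False` avoids storing the index at all.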