Use the SOCR Jobs Data to practice Apriori Association Rule learning
- Load the Jobs Data
- Use this guide to load HTML data
- Focus on the Description feature. Replace all underscore characters "_" with spaces
- Save the data using write.csv() and then use read.transactions() in the arules package to read the CSV data file. Visualize the item support using item frequency plots
- Generate the sparse terms matrix for each job category. Which terms appear most frequently?
- Fit a model: myrules <- apriori(data=jobs, parameter=list(support=0.1, confidence=0.8, minlen=1)). Try out several rule thresholds trading off gain and accuracy
- Evaluate model performance with lift
- Try to improve the model performance
- Sort the set of association rules
- Investigate associations that may be linked to specific job-description terms
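
The steps above can be sketched end to end in R. This is only a minimal illustration, not a solution: the five short descriptions in toy_desc are made-up stand-ins for the real Description column of the Jobs Data, the variable names toy_desc, tmp, and data_rules are hypothetical, and minlen=2 is used here (instead of minlen=1) on the assumption that rules with an empty left-hand side are not of interest.

```r
library(arules)

# Hypothetical stand-in for the Description column of the SOCR Jobs Data
toy_desc <- c(
  "analyze data report",
  "analyze data model",
  "data report",
  "analyze data report model",
  "manage_staff"
)

# Replace all underscore characters with spaces
toy_desc <- gsub("_", " ", toy_desc, fixed = TRUE)

# Save the text, then read it back as transactions (one description per line,
# terms separated by spaces; skip = 1 drops the header row written by write.csv)
tmp <- tempfile(fileext = ".csv")
write.csv(toy_desc, tmp, quote = FALSE, row.names = FALSE)
jobs <- read.transactions(tmp, format = "basket", sep = " ", skip = 1)

# Visualize item support
itemFrequencyPlot(jobs, topN = 5)

# Fit the Apriori model; the thresholds are starting points to be tuned
myrules <- apriori(
  data = jobs,
  parameter = list(support = 0.1, confidence = 0.8, minlen = 2)
)

# Sort the rules by lift and inspect the strongest ones
inspect(head(sort(myrules, by = "lift"), 5))

# Keep only the rules that involve a specific term, here "data"
data_rules <- subset(myrules, items %in% "data")
inspect(data_rules)
```

With the real data, raising or lowering support and confidence changes how many rules survive, so it is worth rerunning apriori() with several settings and comparing the sorted output each time.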