Story By:
You Li
What kind of data do you need? Elaborate this for each small story
• zipcode
• road conditions rating
• the median income of the zip code/city
• the crime rate of the neighborhood
• the racial ratio of the city
• the share of individuals below the poverty level
Why do you need the data?
Zip code and road conditions based on the PASER rating will answer the first question; the median income, crime rate, and racial ratio data will help address the other questions.
How do you use the data in your Machine Learning applications?
• I expect the ML to search the road conditions according to the zip code
• I expect the ML to visualize the road conditions according to the zip code
• I expect the ML to correlate the crime rate with the average road rating in each zip code
• I expect the ML to correlate the racial ratio of the zip code with the average road rating
• I expect the ML to correlate the median household income of the zip code with the average road rating
• I expect the ML to correlate the share of individuals below the poverty level in the zip code with the average road rating
• I expect the ML to retrieve, upon command, all roads rated as poor together with their neighborhood profiles
• I expect the ML to retrieve, upon command, all roads rated as fair together with their neighborhood profiles
• I expect the ML to retrieve, upon command, all roads rated as good together with their neighborhood profiles
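The lookups listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the records, field names, and zip codes are made-up placeholders (not SEMCOG or Census data), and the grouping of PASER ratings into poor (1-4), fair (5-7), and good (8-10) is an assumed convention that should be checked against the definitions SEMCOG uses.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records for illustration only; real road data would come from
# SEMCOG and real neighborhood profiles from the Census.
ROAD_SEGMENTS = [
    {"zipcode": "00001", "road": "Example Ave", "paser": 3},
    {"zipcode": "00001", "road": "Sample St", "paser": 6},
    {"zipcode": "00002", "road": "Demo Rd", "paser": 9},
]
ZIP_PROFILES = {
    "00001": {"median_income": 30000, "poverty_rate": 0.35},
    "00002": {"median_income": 70000, "poverty_rate": 0.15},
}


def paser_category(rating):
    """Map a 1-10 PASER rating to a label (assumed cut-offs: 1-4, 5-7, 8-10)."""
    if rating <= 4:
        return "poor"
    if rating <= 7:
        return "fair"
    return "good"


def roads_in_zip(zipcode):
    """Return every road segment recorded for one zip code."""
    return [seg for seg in ROAD_SEGMENTS if seg["zipcode"] == zipcode]


def average_rating_by_zip():
    """Return the mean PASER rating for each zip code."""
    ratings = defaultdict(list)
    for seg in ROAD_SEGMENTS:
        ratings[seg["zipcode"]].append(seg["paser"])
    return {zipcode: mean(values) for zipcode, values in ratings.items()}


def roads_by_category(category):
    """Return segments in one category, each paired with its zip code profile."""
    return [
        {**seg, "profile": ZIP_PROFILES.get(seg["zipcode"], {})}
        for seg in ROAD_SEGMENTS
        if paser_category(seg["paser"]) == category
    ]
```

For example, `roads_by_category("poor")` would return the poor-rated segments together with the income and poverty figures of their zip codes, and `average_rating_by_zip()` gives the per-zip averages that the correlation bullets rely on.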
How do you get the data?
Road condition data: The Southeast Michigan Council of Governments (SEMCOG) holds the road condition data and will release it upon request: https://semcog.org/pavement. SEMCOG has already visualized road conditions by neighborhood, but its map is not searchable by zip code and does not link the data to census data: https://maps.semcog.org/PavementCondition/
Demographic data: The U.S. Census has demographic data searchable by zip code/city/county: https://factfinder.census.gov/faces/nav/jsf/pages/community_facts.xhtml?src=bkmk
Do you know any specific data set you may use for analysis of your story?
Road condition data from SEMCOG (upon request)
The Census data by zip code https://factfinder.census.gov/faces/nav/jsf/pages/community_facts.xhtml?src=bkmk
Data type for each of your data sources (for example: number, text, images, video, audio, social network, database)
Numbers, text, and database records; users could also submit images of the roads they commute on.
What kind of analysis would you do? (Correlation, frequency, cross-tabulation, average, classification, regression, ANOVA, etc.)
Frequency counts of road conditions based on the PASER rating; correlation between demographic data and road conditions by zip code; t-tests comparing road conditions between neighborhoods with different demographic characteristics. Other suggestions?
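As a rough illustration of these three analyses, the sketch below uses only the Python standard library. All numbers are invented placeholders rather than real road or census figures, and `pearson_r` and `welch_t` are hypothetical helper names. The t-test shown is Welch's unequal-variance form, which is one possible choice; the sketch computes only the t statistic, not a p-value.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Frequency counts of road condition categories (placeholder labels).
categories = ["poor", "fair", "poor", "good", "fair", "poor"]
counts = Counter(categories)

# Correlation between median income and average road rating per zip code
# (placeholder values, one entry per zip code).
median_income = [30000, 40000, 50000, 60000]
avg_rating = [3.0, 4.0, 5.0, 6.0]
r = pearson_r(median_income, avg_rating)

# t statistic comparing PASER ratings in two groups of neighborhoods
# (placeholder values).
group_a = [2, 3, 4]
group_b = [6, 7, 8]
t = welch_t(group_a, group_b)

print(counts.most_common())
print(round(r, 3), round(t, 3))
```

With real data, each list would hold one value per zip code or per road segment. A statistics package such as SciPy could replace the hand-written helpers and would also report p-values.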
What kind of measurements would you use? (Data quality, privacy, etc.)
not sure about this question.
What kind of visualization would be useful? (Submit your visual here)
Color-coded road conditions and a search panel keyed by zip code; the results should display the road condition levels and the neighborhood demographic data. Likewise, from a user’s perspective, if I enter my demographic data such as income, race, and education, it could predict the road conditions I am likely to live with.