📝 Graph Algorithms
August 20, 2019
Data in a graph doesn't follow a bell curve. It follows a power-law distribution: there are usually lots of nodes with few relationships and some nodes with lots of relationships.
Many approaches erroneously focus on the average of the population, where few entities actually exist.
Graphs help you invest in populated areas.
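To make the point about averages concrete, here is a minimal sketch in plain Python. The edge list is a hypothetical toy graph (two hubs and many single-link nodes), not data from the talk, and it is only skewed in the same spirit as a power law rather than fitted to one.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy edge list: two hubs, each linked to its own low-degree nodes.
edges = [("hub_a", f"n{i}") for i in range(40)]
edges += [("hub_b", f"n{i}") for i in range(40, 60)]

# Degree = number of relationships touching each node.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

degrees = sorted(degree.values())
mean_degree = mean(degrees)
median_degree = median(degrees)

# How many nodes actually sit within 0.5 of the mean degree?
near_mean = sum(1 for d in degrees if abs(d - mean_degree) < 0.5)

print(f"mean={mean_degree:.2f} median={median_degree} max={max(degrees)}")
print(f"nodes near the mean: {near_mean} of {len(degrees)}")
```

In this toy graph the mean degree lands between the many degree-1 nodes and the two hubs, so no node is actually close to it.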
Graphs are good at:
- pathfinding and search
- centrality / importance
- community detection (understanding)
- link prediction
Graph algorithms allow you to sort categories of data into buckets. This talk gives an example using the Yelp public dataset, showing how Will went from sorting things by hundreds of categories down to 15 broader categories. Will used an overlap similarity algorithm in Neo4j to determine which categories are more general (the ones with more nodes) and which are more specific.
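As a rough illustration of the idea, the overlap coefficient of two sets is the size of their intersection divided by the size of the smaller set, so a score of 1.0 means the smaller set sits entirely inside the larger one. The sketch below is plain Python, not the Neo4j procedure used in the talk; the category names, business ids, and the `narrower_than` helper are all made up for the example.

```python
def overlap(a: set, b: set) -> float:
    """Overlap coefficient: |A ∩ B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Hypothetical data: category name -> ids of businesses tagged with it.
categories = {
    "Restaurants": {1, 2, 3, 4, 5, 6},
    "Sushi Bars": {1, 2},
    "Pizza": {3, 4, 5},
    "Hair Salons": {7, 8},
}

def narrower_than(categories, threshold=1.0):
    """Return (specific, general, score) for category pairs that overlap enough.

    The category with fewer businesses is treated as the specific one.
    """
    names = sorted(categories)
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            score = overlap(categories[x], categories[y])
            same_size = len(categories[x]) == len(categories[y])
            if score >= threshold and not same_size:
                specific, general = sorted((x, y), key=lambda n: len(categories[n]))
                pairs.append((specific, general, score))
    return pairs

for specific, general, score in narrower_than(categories):
    print(f"{specific} is narrower than {general} (overlap={score})")
```

Following these "narrower than" pairs upward is one way to collapse many specific categories into a handful of general ones.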
There are two general approaches to building a recommendation system: content-based filtering and collaborative filtering.
In their example, they want photo-based recommendations:
- find similar photos using Jaccard similarity
- cluster similar photos using label propagation (community detection)
- recommend businesses connected to photos in the same community
They used Google's Vision API to get text labels for photos.
When things share a lot of labels, they are "similar to" each other.
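A minimal sketch of that similarity step: Jaccard similarity is the number of shared labels divided by the number of distinct labels across both photos. The photo ids, labels, and the 0.5 cutoff below are hypothetical; the talk does not say which cutoff was used.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical labels, standing in for what an image-labeling API might return.
photo_labels = {
    "photo_1": {"pizza", "cheese", "food"},
    "photo_2": {"pizza", "food", "table"},
    "photo_3": {"haircut", "salon"},
}

def similar_pairs(photo_labels, cutoff=0.5):
    """Return (photo, photo, score) for pairs at or above the cutoff."""
    ids = sorted(photo_labels)
    pairs = []
    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            score = jaccard(photo_labels[p], photo_labels[q])
            if score >= cutoff:
                pairs.append((p, q, score))
    return pairs

print(similar_pairs(photo_labels))
```

Each returned pair would become a "similar to" relationship between two photo nodes.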
Ask the user for photos that they "like". Neo4j then finds the communities those photos belong to and the businesses connected to photos in those communities.
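The whole flow can be sketched in plain Python. This is a simplified, deterministic variant of label propagation written for illustration, not Neo4j's implementation; the photo ids, business names, and helper functions are hypothetical.

```python
from collections import Counter, defaultdict

def label_propagation(edges, max_iter=20):
    """Assign each node a community label by repeatedly adopting the most
    common label among its neighbors (ties go to the smallest label)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    labels = {node: node for node in neighbors}  # every node starts alone
    for _ in range(max_iter):
        changed = False
        for node in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[node])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            # Keep the current label if it is already one of the most common.
            if labels[node] not in candidates:
                labels[node] = candidates[0]
                changed = True
        if not changed:
            break
    return labels

def recommend(liked, labels, photo_business):
    """Businesses attached to any photo in the communities of the liked photos."""
    liked_communities = {labels[p] for p in liked if p in labels}
    return sorted({
        photo_business[p]
        for p, community in labels.items()
        if community in liked_communities and p in photo_business
    })

# Hypothetical "similar to" edges between photos, and photo -> business links.
similar_to = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"), ("p4", "p5")]
photo_business = {
    "p1": "Pizza Place", "p2": "Pizza Place", "p3": "Trattoria",
    "p4": "Salon", "p5": "Barber",
}

labels = label_propagation(similar_to)
print(recommend({"p1"}, labels, photo_business))
```

Liking `p1` pulls in every business attached to a photo in the same community, including ones whose photos the user never saw.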
Triangles and clustering coefficients can be used to predict relationships (link prediction).
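For reference, a node's local clustering coefficient is the number of triangles it participates in divided by the number of neighbor pairs that could form one, i.e. 2T / (k(k - 1)) for a node with k neighbors and T triangles. The sketch below computes both on a small hypothetical graph; using common neighbors as a link-prediction signal is one standard approach and an assumption here, since the talk notes do not say which features were used.

```python
from collections import defaultdict
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

def triangle_count(adj, node):
    """Number of neighbor pairs of `node` that are themselves connected."""
    return sum(1 for a, b in combinations(sorted(adj[node]), 2) if b in adj[a])

def clustering_coefficient(adj, node):
    """Local clustering coefficient: 2T / (k * (k - 1)); 0.0 if k < 2."""
    k = len(adj[node])
    if k < 2:
        return 0.0
    return 2 * triangle_count(adj, node) / (k * (k - 1))

def common_neighbors(adj, u, v):
    """Shared neighbors; a link u-v would close one triangle per shared neighbor."""
    return adj[u] & adj[v]

# Hypothetical graph: a triangle a-b-c plus a pendant node d attached to a.
adj = build_adjacency([("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")])

for node in sorted(adj):
    print(node, triangle_count(adj, node), round(clustering_coefficient(adj, node), 3))
print(common_neighbors(adj, "c", "d"))
```

Here `c` and `d` are not connected but share the neighbor `a`, so a `c`-`d` link would close a triangle, which is the kind of signal a link-prediction model can use.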