Source - Référence: (Chang et al. 2008)
Utilisation d’expressions régulières. - Par exemple: Logstash
Géo-localisation - Par exemple: Logstash
Source (Vinyals et al. 2017; Sivic and Zisserman 2009)
Source - Note: Les modèles de forêts aléatoires et de gradient boosting construisent plusieurs arbres pour obtenir davantage de précision. Ce sont des modèles très performants, mais difficiles à expliquer.
Trouver le chemin le plus court répondant à une ou plusieurs contraintes.
Exemple: Easyrec Source
[Source: https://en.wikipedia.org/wiki/Anscombe%27s_quartet]
Il est de plus en plus facile de cacher la discrimination au travers d’algorithmes. Il suffit d’entraîner l’algorithme sur des données basées sur des décisions passées pour y inclure tous les biais des personnes qui ont prises ces décisions.
(Ribeiro, Singh, and Guestrin 2016)
Airbnb Engineering & Data Science: Architecting a Machine Learning System for Risk
Much of our current educational system can be described as “memorize, regurgitate, and forget.” Students learn to “study for the test. […] Computers are very good in storage, retention, and regurgitation.”
Les algorithmes d’apprentissage profond peuvent identifier des tumeurs cancéreuses dans l’imagerie médicale.
Ex: Meilleure détection du cancer du sein
Enjeu: il faut le consentement du patient pour partager les images.
Weapons of math destruction, which O’Neil refers to throughout the book as WMDs, are mathematical models or algorithms that claim to quantify important traits: teacher quality, recidivism risk, creditworthiness but have harmful outcomes and often reinforce inequality, keeping the poor poor and the rich rich. They have three things in common: opacity, scale, and damage. They are often proprietary or otherwise shielded from prying eyes, so they have the effect of being a black box. They affect large numbers of people, increasing the chances that they get it wrong for some of them. And they have a negative effect on people, perhaps by encoding racism or other biases into an algorithm or enabling predatory companies to advertise selectively to vulnerable people, or even by causing a global financial crisis. Review: Weapons of Math Destruction
Systèmes de recommendations:
Segmentation de graphe:
Les gouvernements utilisent le prétexte de la détection de nouveaux modèles de criminalité pour demander de plus en plus de données sur l’usage des moyens de communications par les citoyens.
the future-orientation increasingly severs surveillance from history and memory and the quest for pattern-discovery is used to justify unprecedented access to data
(Lyon 2014)
The NYPD is notorious for its intransigence on open records requests from the press and the public, particularly concerning documentation about the department’s extensive use of surveillance technology. In recent years, lawsuits have been filed to disclose information about the department’s network of surveillance cameras, its use of X-ray scanners in public, and the deployment of facial recognition technology
Transparency Advocates Win Release of NYPD “Predictive Policing” Documents
Modern machine learning for natural language processing is able to do things like translate from one language to another, because everything it needs to know is in the sentence its processing - Ian Goodfellow, OpenAI
Angles, Renzo, and Claudio Gutierrez. 2008. “Survey of Graph Database Models.” ACM Computing Surveys (CSUR) 40 (1). ACM: 1.
Anscombe, F. J. 1973. “Graphs in Statistical Analysis.” The American Statistician 27 (1). Taylor & Francis: 17–21. doi:10.1080/00031305.1973.10478966.
Chang, Fay, Jeffrey Dean, Sanjay Ghemawat, Wilson C Hsieh, Deborah A Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E Gruber. 2008. “Bigtable: A Distributed Storage System for Structured Data.” ACM Transactions on Computer Systems (TOCS) 26 (2). ACM: 4.
El-Abbasy, Mohammed S, Ahmed Senouci, Tarek Zayed, Farid Mirahadi, and Laya Parvizsedghy. 2014. “Artificial Neural Network Models for Predicting Condition of Offshore Oil and Gas Pipelines.” Automation in Construction 45. Elsevier: 50–65.
Garcia, Mari Cruz, Miguel A Sanz-Bobi, and Javier del Pico. 2006. “SIMAP: Intelligent System for Predictive Maintenance: Application to the Health Condition Monitoring of a Windturbine Gearbox.” Computers in Industry 57 (6). Elsevier: 552–68.
Han, Jing, E Haihong, Guan Le, and Jian Du. 2011. “Survey on Nosql Database.” In Pervasive Computing and Applications (Icpca), 2011 6th International Conference on, 363–66. IEEE.
Lyon, David. 2014. “Surveillance, Snowden, and Big Data: Capacities, Consequences, Critique.” Big Data & Society 1 (2): 2053951714541861. doi:10.1177/2053951714541861.
Manning, Christopher, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven Bethard, and David McClosky. 2014. “The Stanford Corenlp Natural Language Processing Toolkit.” In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 55–60.
Menk dos Santos, Alan. 2015. “A Hybrid Recommendation System Based on Human Curiosity.” In Proceedings of the 9th Acm Conference on Recommender Systems, 367–70. ACM.
Moniruzzaman, ABM, and Syed Akhter Hossain. 2013. “Nosql Database: New Era of Databases for Big Data Analytics-Classification, Characteristics and Comparison.” arXiv Preprint arXiv:1307.0191.
Nguyen, Tien T, Pik-Mai Hui, F Maxwell Harper, Loren Terveen, and Joseph A Konstan. 2014. “Exploring the Filter Bubble: The Effect of Using Recommender Systems on Content Diversity.” In Proceedings of the 23rd International Conference on World Wide Web, 677–86. ACM.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, 1135–44. ACM.
Sivic, Josef, and Andrew Zisserman. 2009. “Efficient Visual Search of Videos Cast as Text Retrieval.” IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (4). IEEE: 591–606.
Terveen, Loren, and Will Hill. 2001. “Beyond Recommender Systems: Helping People Help Each Other.” HCI in the New Millennium 1 (2001). Addison-Wesley, Reading, MA: 487–509.
Vinyals, Oriol, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2017. “Show and Tell: Lessons Learned from the 2015 Mscoco Image Captioning Challenge.” IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (4). IEEE: 652–63.
Yang, Diyi, Tanmay Sinha, David Adamson, and Carolyn Penstein Rosé. n.d. “Turn on, Tune in, Drop Out: Anticipating Student Dropouts in Massive Open Online Courses.” In.
Zhang, Lei, Fan Yang, Yimin Daniel Zhang, and Ying Julie Zhu. 2016. “Road Crack Detection Using Deep Convolutional Neural Network.” In Image Processing (Icip), 2016 Ieee International Conference on, 3708–12. IEEE.