International Journal of Mathematical, Engineering and Management Sciences

ISSN: 2455-7749

Machine Learning in Big Data

Lidong Wang
Department of Engineering Technology, Mississippi Valley State University, Mississippi, USA.

Cheryl Ann Alexander
Technology and Healthcare Solutions, Inc., USA.

DOI https://dx.doi.org/10.33889/IJMEMS.2016.1.2-006

Received on March 15, 2016
Accepted on April 02, 2016


Machine learning is an artificial intelligence method for discovering knowledge that supports intelligent decision making. Big Data has a great impact on scientific discovery and value creation. This paper introduces machine learning methods, the main Big Data technologies, and some applications of machine learning in Big Data. It then discusses the challenges of applying machine learning to Big Data and presents some new methods and recent technological progress in the field.

Keywords: Big data, Machine learning, Big data analytics, Information technology, Stream processing.


Wang, L., & Alexander, C. A. (2016). Machine Learning in Big Data. International Journal of Mathematical, Engineering and Management Sciences, 1(2), 52-61. https://dx.doi.org/10.33889/IJMEMS.2016.1.2-006.



