International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences
E-ISSN: 2349-7300
Impact Factor: 9.907

A Widely Indexed Open Access Peer Reviewed Online Scholarly International Journal


Optimal Utilization of Computing Resources with Data Parallelism and Task Parallelism in building AI models

Authors: Vamshi Krishna Malthummeda

DOI: https://doi.org/10.37082/IJIRMPS.v13.i4.232655

Short DOI: https://doi.org/g9t99t

Country: United States


Abstract: This article examines techniques for making the most effective use of computing infrastructure when training and tuning, in a distributed fashion, the AI models that various industries rely on to make informed decisions. The objective is to investigate the potential benefits of data parallelism and task parallelism for optimizing computing infrastructure provisioned as Spark clusters on Databricks. In addition, the paper illustrates how data parallelism and task parallelism can be achieved by setting up a Ray cluster on top of a Databricks Spark cluster to train and tune time series forecasting models that forecast sales of various products at stores in the retail industry. The paper also discusses the significance of data parallelism and task parallelism in driving down the compute cost and time of AI model building. It aims to give machine learning engineers, organizational IT sponsors, and all other stakeholders actionable insights toward adopting a Ray cluster on top of a Databricks Spark cluster, and it presents a detailed analysis of time series forecasting using the combination of Ray and Spark clusters facilitated by Databricks. Finally, the paper aims to help organizations improve the efficiency of their computing resources, reduce operational costs, and speed up forecast model building using the solution presented.
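
As a rough illustration of the two ideas named in the abstract, the sketch below treats each (store, product) sales series as an independent data partition and each hyperparameter trial as an independent task, then runs all of them concurrently. It is not the paper's implementation: the toy data, the moving-average forecaster, and the names `SALES`, `WINDOWS`, `evaluate`, and `tune_all` are assumptions made for this example, and a standard-library thread pool stands in for a Ray cluster (with Ray, `evaluate` would typically be decorated with `@ray.remote` and the results collected with `ray.get`).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical toy data: one short sales series per (store, product) pair.
SALES = {
    ("store_1", "milk"): [10, 12, 11, 13, 12, 14, 13, 15],
    ("store_1", "bread"): [5, 5, 6, 5, 6, 6, 7, 6],
    ("store_2", "milk"): [20, 19, 21, 22, 20, 23, 22, 24],
}
WINDOWS = [1, 2, 3]  # candidate hyperparameter values (moving-average window)
HOLDOUT = 2          # number of trailing observations held out for validation


def evaluate(key, window):
    """One independent task: score one window size on one series."""
    series = SALES[key]
    train, valid = series[:-HOLDOUT], series[-HOLDOUT:]
    # Naive stand-in model: forecast the mean of the last `window` points.
    forecast = sum(train[-window:]) / window
    mae = sum(abs(v - forecast) for v in valid) / len(valid)
    return key, window, mae


def tune_all(max_workers=4):
    """Fan out every (series, window) pair, then keep the best window per series."""
    jobs = list(product(SALES, WINDOWS))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `jobs`.
        results = list(pool.map(lambda job: evaluate(*job), jobs))
    best = {}
    for key, window, mae in results:
        # Strict comparison keeps the first window seen when scores tie.
        if key not in best or mae < best[key][1]:
            best[key] = (window, mae)
    return best


if __name__ == "__main__":
    for key, (window, mae) in tune_all().items():
        print(key, "best window:", window, "validation MAE:", round(mae, 3))
```

Because no trial depends on another, the work scales by adding workers. In a real setting the forecaster would be a model such as Prophet and the candidate values would come from a tuner such as Optuna, both of which appear in the keywords.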

Keywords: Data and Logical Parallelism, Ray Cluster on Databricks Spark Cluster, Prophet Time Series Data Forecasting, Retail Sales, Machine Learning, Training and Tuning of AI Models, Optuna, Hyperparameter Optimization


Paper Id: 232655

Published On: 2025-07-25

Published In: Volume 13, Issue 4, July-August 2025
