Student: Farhan Khan
Project Title: Scheduling in the Spark Cloud Computing System
Location: ASU Goldwater Center
BASIS Advisor: Dr. Pearson
On-site Advisor: Dr. Lei Ying
On-site Advisor Contact Information:
Office: 436 Goldwater Center
Phone (o): 480-965-7003
Fax: 480-965-3837
Email: lei.ying.2@asu.edu
Mail: Arizona State University, PO Box 875706, Tempe, AZ 85287-5706
Mode of Daily Contact: Blog
Course Goals: This project, Scheduling in the Spark Cloud Computing System, has three main objectives. First, I will become familiar with the Spark system and its scheduling algorithm. Second, I will answer the question of how scheduling can improve cloud computing systems. Third, I will attempt to implement scheduling in the Spark Cloud Computing System. In pursuit of these goals, I will first answer the question: what is networking for big data? Next, I will familiarize myself with the programming language, its syntax, and the different types of computations performed within data centers. Finally, I will learn the Spark system, understand its architecture, and learn to use Spark to process big data. I will learn all of this from books, scholarly articles, and Dr. Lei Ying and his students.
Course Texts:
Leskovec, Jure, Anand Rajaraman, Jeffrey Ullman. Mining of Massive Datasets. Palo Alto: Cambridge University Press, 2014. Print.
Ousterhout, K., Wendell, P., Zaharia, M., & Stoica, I. (2013, September 24). Sparrow: Distributed, Low Latency Scheduling. Retrieved from http://www.eecs.berkeley.edu/~keo/publications/sosp13-final17.pdf
Flanagan, David. Java in a Nutshell, Second Edition. Sebastopol: O’Reilly & Associates, Inc., 1997. Print.
Lippman, Stanley B. C++ Primer. 2nd ed. Reading: Addison-Wesley, 1997. Print.
Wang, Weina, Kai Zhu, Lei Ying, Jian Tan, Li Zhang. Map Task Scheduling in MapReduce with Data Locality: Throughput and Heavy-Traffic Optimality. PDF.
Zaharia, Matei, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. PDF.
Purohit, Abhijeet, Md. Abdul Waheed, Asma Parveen. “Load Balancing in Public Cloud by Division of Cloud Based on the Geographical Location.” International Journal of Research in Engineering and Technology 3.3 (2014): 316-320. Print.
Begum, Suriya, Dr. Prashanth. “Review of Load Balancing in Cloud Computing.” International Journal of Research in Engineering and Technology 10.1 (2013): 343-352. Print.
Project Product Description: I will write a program that demonstrates my understanding of the programming language, scheduling, and load balancing. The final product will be a working program that has been thoroughly tested.