Monday, April 25, 2016

Week 11 Concluding Post

Hello Readers,

My presentation is next week, which means I have reached the end of my Senior Research Project and, with it, the end of regular updates to this blog. Although I may post a couple of additional updates, this is most likely the final post. My program is finalized and, as far as I can tell, works without issues.

Thank you all for reading. If you have any questions, just comment on this post. If your question is about a specific post, reference the title of that post in your comment.

Thanks for reading and have a great day!


3 comments:

  1. Hi,
    Great job, and good luck with your work!
    I would like to ask a few questions and, if possible, get some guidance.
    I also have to implement the PageRank algorithm on Spark, work with a small sample of data first, and then increase the volume to study Spark's behavior and performance. But I don't know where to begin. What must I install beforehand, and how do I do it?
    Thank you in advance.

  2. First of all, I'd like to note that this site is not maintained. You are of course welcome to comment, but comments may or may not be answered in due time. For any specific questions I would recommend http://stackoverflow.com/, a large forum with many programmers from diverse backgrounds who may be able to provide a more thorough and specific answer.
    That said, thank you for your interest!
    Your question is quite vague, and your specific needs and skills may differ from the ones that I posted about.
    I used the Java programming language and Eclipse (Juno at the time, although I am sure you can use any other recent Eclipse release, such as Mars). You can download Spark itself from the following link: http://spark.apache.org/downloads.html
    That link also provides documentation for Spark, from which you can learn more about it (which may be necessary as you plan to use it in your program).
    I created the Java project within the org.apache.spark.examples package (I am not sure whether this is necessary).
    From there, you can use Spark's libraries within the project. To do so, however, you will first need to add Spark's jar files to the project's referenced libraries:
    Right-click the Java project (under the Package Explorer tab, which is typically on the left-hand side of the screen) -> Click on Build Path -> Click on Add External Archives...
    From there, locate the folder in which you saved Spark when you downloaded it. Open the folder labeled "spark-1.6.0-bin-hadoop2.6" (the download may need to be unzipped first, and your version may be different). Open the folder labeled "lib". In there, you should find several jar files. Select all of them (hold Shift while clicking to select multiple) and click OK. This should add the jar files to your build path. To check, look under the "Referenced Libraries" drop-down inside the project's drop-down in the Package Explorer.
    This should allow you to import the various Spark libraries into the Java project and initialize Spark contexts within the code.
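    The build-path check described above can also be done from code. The following is only a minimal sketch, not code from the original project: the class name SparkClasspathCheck and its helper method are hypothetical, and it assumes the Spark jars contain the classes org.apache.spark.SparkConf and org.apache.spark.api.java.JavaSparkContext. It uses only the standard Java library, so it compiles and runs even when the jars are missing.

    ```java
    // Hypothetical helper, not from the original project: reports whether the
    // Spark jars added through "Add External Archives..." are visible at runtime.
    class SparkClasspathCheck {

        // Returns true if the named class can be located on the classpath.
        // The class is looked up without being initialized.
        static boolean isOnClasspath(String className) {
            try {
                Class.forName(className, false, SparkClasspathCheck.class.getClassLoader());
                return true;
            } catch (ClassNotFoundException | LinkageError e) {
                return false;
            }
        }

        public static void main(String[] args) {
            // Assumed to be present in the Spark jars on the build path.
            String[] sparkClasses = {
                "org.apache.spark.SparkConf",
                "org.apache.spark.api.java.JavaSparkContext"
            };
            for (String name : sparkClasses) {
                System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found"));
            }
        }
    }
    ```

    If both classes are reported as found, the jars are most likely on the build path. If not, repeating the Add External Archives step is a reasonable first thing to try.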
    I had some difficulty with this process as well, so I understand your confusion.
    If this was not a satisfactory answer to your question, I highly encourage you to visit the Stack Overflow website that I mentioned before and post there (I am not affiliated with Stack Overflow in any way; I'm just a fan of the site).

    1. Hi,

      Thank you very much for your valuable help and advice.
      I'll turn to Stack Overflow for further explanations. I just want to use Spark directly on my machine without Java, and to use the Scala language to run scripts.

      Thank you.
