Chandan Nishad

Discussion  - 
Hi everyone,

I am trying to read data from HDFS into R. After importing the data into R, I see only 7,40,726 rows, but the file actually contains around 11,55,000 rows and is only 105 MB in size.
Below is the code I have written to read the data.



# To read the data from HDFS.
f = hdfs.file("/tmp/projectdata/churndata/Apr_data1.csv","r",buffersize=104857600)
m =
c = rawToChar(m)
data = read.table(textConnection(c), sep = ",",fill = TRUE);

Kindly let me know what I should do to read the complete data (11,55,000 rows).
I look forward to your response.

Thank you,

Swetha Nomula

Discussion  - 
Big Data Hadoop Online Training and Placement Assistance

The Big Data and Hadoop Training course from H2KInfosys is designed to help you build the knowledge and skills needed to become a successful Hadoop developer. You can become a Hadoop expert by mastering the most important parts of the Hadoop Big Data ecosystem, such as MapReduce, HDFS, Sqoop, and many more.

Attend our Big Data Hadoop Online Training Demo for free.

Our Big Data Course Special Features:
• Structured Course Curriculum Content.
• One-time payment, lifetime access to all videos and sessions.
• Daily Assignments and weekly tests.
• Unlimited mock interview sessions.
• Resume Preparation.
• 100% Job Placement Assistance.

For more details:
Email :
Call us:
USA:  +1 770-777-1269

#BigData  #Hadoop  #ITTraining  

The Chairman, CEO, and President, Brian William Mead, is a serial entrepreneur in the technology software delivery services business, having started and run three such businesses over the past 20 years. Mr. Mead understands the value proposition of a quality delivery services organization, having most recently built, led, and managed a highly successful full-service SAP organization in Chicago, IL, known as ProSoft Technology Group, Inc., which was recently acquired by Kellton Technologies, a publicly traded Indian services company. Join our community for more information.
Business Transformation is your "Touchstone"

Getting #StartwithHadoop 
Enroll Here :

Arti Prasad

Discussion  - 
Gartner predicted, “Hadoop will be in most advanced analytics products by 2015.”
So join us and upgrade your skills.
Our upcoming online and classroom Hadoop administration training starts on Oct 24th and 25th, 2015.
Please contact us at or call us @ +91-9008587999
For more details -
Our Blog -

Usha Chandrika

Discussion  - 
#Hadoop was introduced to address big data problems in organizations’ databases, and learning MapReduce gives you the crux of big data. @KernelTraining.Com you can register for a free #webinar at

Usha Chandrika

Discussion  - 
Which freaking Hadoop engine should I use? These four truths will help you determine which Hadoop technology to use for the types of workloads you… - KernelTraining.Com - Google+



Usha Chandrika

Discussion  - 
What is the average salary of a #VMware administrator? Learn the #VMware #online course from an industry expert. Kernel Training provides professional #VMware #online #training. You can attend a free #webinar demo #class; register at #KernelT

How is a #bigdata distribution processed?
Learn #hadoop Online Training 
Learn More :

Arti Prasad

Discussion  - 
“Data is the new science. Big Data holds the answers.” – Pat Gelsinger
We have some of the latest technology and job opportunities in the Big Data area. Please go through this link for more information:

Pragya Jain

Discussion  - 
Hello friends, I have a question about running a MapReduce job using Oozie. When I run the MapReduce job in Oozie, it throws the error "variable [nameNode] cannot be resolved". Please guide me on where I am going wrong.

rooz munjal

Discussion  - 
Participate and win 50% Off on Beginner Courses
Click Now

Arti Prasad

Discussion  - 
We are proud to announce that registration is open for our next Big Data Bootcamp (online class) for the US location (weekend classes on Saturday and Sunday, 4 hours each, for 3 weeks).

Online Batch dates for July and August 2015:
July 25th and 26th
August 1st and 2nd
August 8th and 9th

Please visit our website and register: Netscientium.com
send email to

Hurry up as seats are limited for the special offer.
We have exciting early-bird and buddy offers of up to 25% if you register early.


IT professionals with programming experience, J2EE architects, developers, enterprise/solution architects and technology managers, who are looking to gain in-depth knowledge and redefine the enterprise analytics platform.


-Interactive classes.
-Our trainers have over 12–15 years of experience in the field of Big Data.
-Access to our cloud server for testing cluster set up.
-Hadoop Cluster set-up project and weekly calls with our developers for two weeks after the 3-day session.
-Hands-on training on how to approach a Big Data project.
-Weekend mentoring discussions with our technical experts.
-Quality assurance: if you are not satisfied, your next training is free.
-Step-by-step approach for you to be a certified Big Data Analyst.
-Life-time access to the learning management systems (LMS).
-Technical support for assignments and queries through email.
-Batch rescheduling flexibility! In case you miss a class… fret not! You can join the next batch.


Day 1
-What is Big Data?
-Why Hadoop? Hadoop Overview and its ecosystem.
-Big Data, Hadoop 2.X architecture.
-HDFS (Hadoop Distributed File System) and YARN architecture.
-MapReduce Anatomy, developing MapReduce programs.
-Advanced MapReduce concepts, coding.

Day 2
-Hadoop security.
-Advanced MapReduce algorithms, advanced tips/techniques.
-Setting up a Hadoop cluster (2 Nodes: 1.2 and 2.3).
-Monitoring and management of Hadoop cluster.
-Pig, Pig Latin, Pig with HDFS, UDF.
-Hadoop design patterns and architecture principles.

Day 3
-Sqoop, importing and exporting data using Sqoop.
-Hive: Hive architecture and coding.
-Flume: Use cases, Installation, practicals.
-HBase: Architecture, CRUD, scanning and filters.
-Project use case coding.
-Post-training (2 Weekend sessions).

Big Data has taken the business world by storm. It is everywhere, telling businesses how to sell. Everybody wants to know how they can gain the ‘right’ information to better support their customers, elevate their market strategy and forecast their business’ future.

There is already a massive archive of data and it is only growing! CIOs are constantly looking for knowledgeable/trained Big Data professionals to help them glean valuable information that is hidden in all that data. At Netscientium, we offer you a three-day comprehensive/detailed course that will ensure that you succeed in your career.

Here you are invited to a world of new information and possibilities through our Hallmark program in Big Data.

We believe that immersive education is the best way to learn how to code, create solutions, take effective architectural and design decisions and manage the team. At the Bootcamp you will gain extensive/in-depth knowledge of Java, Hadoop and the Big Data toolset and cloud.
Become an expert at Big Data analytics!