 
In this post I do a deep dive into the records of the all-time batting legends of cricket to identify interesting information about their achievements. In my opinion, the usual currency for batsman…

Joann Dujardin

Discussion  - 
 
Consumers are inundated each day with new products and brands, making it increasingly difficult for them to differentiate one business from another.

Therefore, to stand out, you must figure out how to distinguish yourself. One way to do this is by focusing on ways to provide customers a unique experience, one which has been tailored to appeal to their exact needs and wants.

In our latest blogpost, we've explored how personalization creates the type of meaningful, memorable customer experience that drives conversion.
Comprehensive customer data is crucial to knowing who your customers are and what they value most. Here are 3 ways data will give you a competitive edge.

RDBMS Tutorial

Discussion  - 
 
How to use data to power dynamically generated websites?
Brief overview and a simple diagram that explains how data driven websites actually work.
 
Question about self-studying math for data mining/machine learning

I'm a web developer who graduated from a uni with a degree in IT. When I was a uni student, I took one data mining subject and found DM/ML highly interesting. Even after becoming a web developer, I'm still interested in it.
I'm very seriously thinking about applying for a PhD course in the field, hoping to become a scientist/researcher etc. However, my degree is not in CS/Math/Data science, so I decided to prepare myself for it.
Recently I got a data mining textbook and have been teaching myself with it, but it's difficult for me to comprehend what the book says, mainly because it involves many mathematical terms which I don't know/remember.

Here is a list of the math I found in the book.
* Linear algebra (Used for cosine similarity)
* Statistics & probability (Used for covariance of numeric data etc. Maybe these are part of high school math)
* Chi-squared test (Used for analysing and processing data)
* Wavelet (Used for data processing)

I guess all of them are essential but I'm not 100% sure.

So, my question is, is it necessary to understand all of them above? (I think it is) And, do you know any other math which is necessary to comprehend data mining technologies (maybe machine learning too)?

Any advice would be appreciated!
Regards
 
Study the foundations of calculus, linear algebra, probability theory, statistics and functional analysis thoroughly. This will allow you to understand how machine learning algorithms work.
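
For anyone following along, here is a minimal sketch of three items from the list in the question: cosine similarity, sample covariance, and the chi-squared statistic for a contingency table. It is plain Python with no libraries. The function names and the toy numbers are made up for illustration and do not come from the textbook mentioned in the question.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sample_covariance(xs, ys):
    """Sample covariance (divides by n - 1) of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def chi_squared_statistic(observed):
    """Pearson chi-squared statistic for a 2D table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    return stat


# Toy data, invented for this example.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
print(sample_covariance([1, 2, 3], [2, 4, 6]))
print(chi_squared_statistic([[10, 20], [20, 10]]))
```

Cosine similarity only needs dot products and vector norms, covariance only needs means and sums, and the chi-squared statistic compares observed counts with the counts expected under independence. Turning the statistic into a p-value needs the chi-squared distribution, which is where the probability theory comes in.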
 
Spark vs Hadoop: which framework do you choose?
https://acadgild.com/blog/spark-vs-hadoop/
Hadoop and Spark are two terms that are frequently discussed among Big Data professionals. But the big question is whether to choose Hadoop or Spark as your Big Data framework. In this blog we will compare both of these Big Data technologies and look at their specialties and the factors behind the huge popularity of …

Nikhil Dandekar

Discussion  - 
 
Human labels vs clicks for training a machine learned ranking model
Say you want to train a machine learned model to perform a ranking task. A common starter question is: Should I use human relevance…

Manjunath N

Discussion  - 

Henry Sneath

Discussion  - 
 
 
#scotus issues another important #patent decision in favor of patentees, overturning the dual objective/subjective test of the #Seagate case and setting the apparent new standard that enhanced damages are a "sanction for egregious infringement behavior." The www.patentlyo.com blog provides the details.
by Dennis Crouch The Supreme Court today issued an important unanimous decision in Halo v. Pulse - vacating the Federal Circuit's rigid limits to enhanced …

astrid ayel

Announcements  - 
 

The Knowledge Transfer Network is hosting an exclusive event on Wednesday 20th July that will bring industry leaders together with high-growth potential entrepreneurs offering innovative digital solutions that use personal data to transform healthcare.

This ‘Digital meets Health’ networking session will open doors to larger brands who are looking to the digital community to implement agile innovation. Confirmed partners currently include Axa Health PPP, Merck Sharp & Dohme, GlaxoSmithKline, AstraZeneca, BUPA, Johnson & Johnson, Samsung and Eli Lilly.

The session will showcase companies developing disruptive data capture technologies, innovative data-driven applications and visualisation techniques, and new ways of sharing personal data. Up to 16 of the most innovative companies to apply to attend will be invited to meet with established industry brands, for an evening of facilitated networking.

The event will take place 5:30-8:30pm on Wednesday 20th July at the Digital Catapult in London.

Apply before noon on the 5th July here: http://bit.ly/DigitalHealthSpeedNetworking

Siraj Raval

Discussion  - 
 
Curious how Google's newly released Parsey McParseface works? Check out my latest video: https://www.youtube.com/watch?v=AKwfVAKaigI

About this community

It's a home for anybody data-curious, whether your data is big, small, square or scruffy. If you think you can decide better, do better, or be better through data, you belong here! Before posting, please check out our guidelines below.

Pavan Kumar

Discussion  - 
 
Python Pandas use in Data Science

Shilo Rea

Discussion  - 
 
Under growing pressure to report accurate findings as they interpret increasingly large amounts of data, researchers are finding it more important than ever to follow sound statistical practices. For that reason, a team of statisticians including Carnegie Mellon University's Robert E. Kass wrote 'Ten Simple Rules for Effective Statistical Practice.'
 
 
The next innovation will be physical servers on the ground

;-)
We’re seeing a significant shift in the Platform as a Service and Infrastructure as a Service markets as the cloud becomes a global phenomenon, with enterprise adoption and regulatory pressures driving the need for better data governance and control. Cloud providers can no longer simply tell customers not to worry about where their data is …

Manjunath N

Discussion  - 
Hadoop is hot. But its close cousin Spark is even hotter. Developed at UC Berkeley’s AMPLab, Apache Spark is a framework for performing data analytics on distributed clusters like Hadoop. It provides in-memory computation to speed up data processing. It runs on top of an existing Hadoop cluster and accesses HDFS. It can also process structured data in …

Anna Beckham

Discussion  - 
 
The new "Google Research, Europe" group in Zurich, Switzerland, will focus on various aspects of machine learning research.

James Goode

Discussion  - 
 
Over the past few years, we’ve seen a steady rise in the importance of data analytics, in organizations as varied as consumer goods companies, professional sports franchises, political consultancies, medical research institutions, and financial firms. At the steering wheel of many of the analytics these organizations perform is not [...]
 
 
This post will provide insights into NoSQL and, as a continuation, in the next post I will provide a short and precise tutorial on MongoDB. About Databases: Before diving into NoSQL, let me discuss the importance of databases. Dat...
 
+Oleg Nizhnik So if your knowledge goes beyond the Cassandra wiki, why don't you present proof for your comment? This post mainly focuses on NoSQL and MongoDB, not on whether Cassandra qualifies as AP. Anyway, thank you for your remarks.

Anna Beckham

Discussion  - 
 
Meet the people who can coax treasure out of messy, unstructured data.