Diggernaut LLC
2 followers
Posts

Web scraping in independent journalism

In the modern information environment, collecting, processing, and analyzing data can take a long time. The volume of information grows every year, and with it the time required to gather it. Where a journalist once had to search for and collect information from different sources by hand, then process it, organize it into a structured form, analyze it, and use it in their work, they can now automate this process and save the time for more important things. This automated collection of data from the Internet is called web scraping.

Until recently, engaging in web scraping required specific knowledge, in particular of a programming language such as Python or Ruby, because the journalist had to write a program that would retrieve and process the data without human intervention. Beyond the programming language, the journalist also had to learn HTML markup and CSS styles, since building a scraper without this knowledge was simply impossible. These factors greatly limited the use of this method in independent journalism: while large media companies could afford to hire a staff of programmers for the task, independent journalists for the most part either tried to acquire the necessary knowledge themselves or abandoned the idea of automating their data collection.
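To illustrate the kind of program a journalist once had to write by hand, here is a minimal sketch in Python using only the standard library. The page markup, the `headline` class name, and the headlines themselves are invented for the example; a real scraper would typically download pages with a library such as requests and parse them with a tool like BeautifulSoup, but the underlying idea is the same: walk the HTML and pull out the pieces of data you care about.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collects the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False

    def handle_starttag(self, tag, attrs):
        # Start collecting text when we enter a headline element.
        if tag == "h2" and ("class", "headline") in attrs:
            self._in_headline = True

    def handle_endtag(self, tag):
        # Stop collecting when the headline element closes.
        if tag == "h2":
            self._in_headline = False

    def handle_data(self, data):
        if self._in_headline and data.strip():
            self.headlines.append(data.strip())


# A stand-in for HTML fetched from a news site.
page = """
<html><body>
  <h2 class="headline">City council approves budget</h2>
  <p>Story text...</p>
  <h2 class="headline">Local school wins award</h2>
</body></html>
"""

parser = HeadlineParser()
parser.feed(page)
print(parser.headlines)
```

Even this toy version shows why HTML and CSS knowledge was unavoidable: the journalist had to know exactly which tags and class names marked the data on each target site, and update the code whenever the site's layout changed.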

Today, however, this technology has become much more accessible to independent journalism thanks to innovative companies working in this field. One such company is Diggernaut. What distinguishes our service from similar systems is support for complex nested data structures and the ability to use the service without any special knowledge, which makes this method far more attractive to independent journalists. We achieved this by creating a dedicated application for Google Chrome that lets the user quickly and easily create scripts describing the logic of their diggers; the app is easy to learn through a series of video lessons. Another advantage of the service is that resources used in debug mode are free, so you can explore the service and get used to the system without spending a cent. The service works on a freemium model: you can work for free, but to obtain additional resources you need to move to a paid plan. In fairness, services such as ScrapingHub and import.io should also be mentioned; however, according to the tests we conducted, the Diggernaut application is clearer and easier to learn, and the cost per request is lower than what other services charge.

In conclusion, as computing power grows we can now apply machine learning algorithms to this problem, so it is likely that in the near future innovative companies will present solutions in which human intervention is minimal. Just imagine a system where you no longer need to describe the scraper's logic at all: it automatically detects and extracts data from any specified web resource. This is already possible in certain sectors, which gives us confidence that the technology will eventually allow us to fully automate work with data and make better use of our time.



Diggernaut - Leave the work to the robots!