Font
Large
Medium
Small
Night
Prev Index    Favorite Next

Chapter 12 Treat yourself like a donkey

Of course, it is not enough for the Stock God 1.0 to have an analysis module and a collection module. It also needs a data processing module. If the analysis module is the brain and the collection module is the hands and feet, then the processing module is the digestive system.

When massive data information is extracted from the ocean of data, it is necessary to process and process it into a data pattern that can be used by the data analysis module.

For example, an annual financial report of a listed company has a lot of content, ranging from personnel changes to corporate strategies, mergers and acquisitions, profit and revenue, etc., and these things are key information. A financial report of tens of thousands of words plus various icons, among which various key information must be understood and processed by the analysis module. This is the main task of processing modules.

The main functional part of this data processing module is actually natural language processing. The program itself cannot understand the connotation of the language. It must not understand what the meaning of "directed additional issuance of 1 million restricted shares" means. At this time, it is necessary to think that this sentence is assigned to make it into data information that the machine can understand.

For example, first perform meaning division, set the orientation as one meaning unit, and set the additional issuance as another unit. In this way, the entire sentence is divided according to the meaning unit and assigned values ​​separately.

This set of processing methods involves the language processing problem of human-computer interaction, which requires artificial help to computers to understand and process human language, so that machines can understand grammar and semantic units, connect context, and handle different meanings of the same phrase in different contexts.

Simply put, allowing machines to understand human language is the main goal of natural language and the main function of this processing module.

Analysis, collection and processing, these three modules are the main functional structure of Stock God 1.0, but that is not enough. Stock God still needs a lot of auxiliary modules.

For example, it needs to have a storage module. When all the data information is collected, it must be sorted out and processed, and then classified and stored. It is like a super library and must have its own classification and storage rules. Without these, you simply stack them together, and you can imagine what a painful and terrifying process it would be when you need to find a page of specific content from tens of millions of books.

In addition, the stock god also needs corresponding display and interaction modules. As a software, the stock god needs to have its own operating interface, be able to display processing results or processes, be able to receive instructions, and conduct human-computer interaction.

Only when these five modules are combined and can cooperate smoothly with each other can the stock god system be basically formed, and there will definitely be various problems in the middle, which need to be solved one by one by one. During the use process, it will definitely involve continuous revisions and improvements, and all of these will be Mo Hui's work.

According to Mo Hui's estimate, the size of the entire stock god is unlikely to be less than 1 million lines of code. If you want to make the stock god as perfect and accurate as possible, its size will definitely turn upwards. If you want to achieve any function, you must pay the corresponding price. If you want to make the stock god's predictions as accurate as possible, then it is definitely essential to keep investing in.

This is just the stock god itself. If the stock god wants to operate, then Mo Hui will inevitably face bandwidth problems. Once the crawler runs, a large amount of data will be transmitted back, and these data are at least T-level.

In the field of computers, the unit of data size is 1024-digital, one byte is byte, 1024 byte is KB, 1024K is M, 1024M is G, and 1024G is T.

For example, the storage capacity of our mobile phones may be 4G, the storage capacity of a laptop may be 400G, and the 400G of a laptop is about equivalent to a thousand movies.

The data collected by the stock god through crawlers must be huge, at least at T-level, and even if it reaches P-level, it is not a big deal. For example, the data of 1P is approximately equivalent to 2.5 million movies. A person’s life is only 30,000 days, and watching ten movies a day is enough to watch ten lifetimes.

In the face of such a large amount of data, Mo Hui will inevitably face a bandwidth problem. It is easy to imagine that the broadband in the community in the rental house will definitely not work.

The computing power of the superbook has been verified and should be relatively extraordinary, but its storage capacity has not been tested yet. If the storage capacity is not enough, Mohui must also find a storage space for this massive amount of data.

There are many other problems. If Mo Hui wants to complete the stock god and go online, he must walk forward diligently like an old ox and deal with all these road blocking stones one by one.

Originally, these things were handed over to a company and a mature team to handle them, but they may not be able to handle them well. Now, don’t need to deal with them alone, and it is very likely that they must handle them alone without showing off. The difficulty can be imagined.

Thinking about the future, Mo Hui feels like climbing Mount Everest, so high~~~

Fortunately, Mo Hui is considered an insider in the industry. These things can basically be considered as their job, and they are nothing more than project managers, product managers, main processes, and architecture. It is a bit difficult, and the workload is a bit too high, but at least there is still a solution. As long as you walk step by step along the road, there will always be a day of completion.

The workload is not small, but there are not shortcuts. Don’t click back to open the web page and start collecting the open source software he needs. He went to the Open Source Home to search and found that there were more than 100 open source crawlers, and there were probably some suitable ones.

He simply searched all the five major modules, and most of them also have similar alternative software. Now all he needs to do is to find the most suitable one in it, and then modify it and assemble it.

First of all, you need to choose a development language. Various languages ​​have their own scope of application and advantages and disadvantages. Once selected, the five major modules need to be developed from the same language, which is convenient for assembly and expansion and development.

Mo thought about it and finally chose C++ because this language is closer to the underlying layer and assembly, and the overall execution efficiency and speed are relatively good.

Mo Hui began to search and filter out common open source software online, download all the C++ developed software that basically meets the requirements, and classify and store them first.
Chapter completed!
Prev Index    Favorite Next