Training, Test, and Validation Data Set Generator

Fintel Machine Learning tools reduce the effort to build and test machine learning-based algorithmic trading platforms.

This tool generates training data sets suitable for a variety of machine learning algorithms used to generate stock market trading strategies. The training set is a pipe-delimited text file of securities in which the first column is a label indicating whether the security exceeded the benchmark by the required excess return within the lookback period. It is not enough to know whether a trade was profitable; trades must beat benchmarks.
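As a rough illustration of how such a file might be consumed, the following Python sketch splits each pipe-delimited row into its label and its remaining columns. The function name read_training_rows and the sample values are hypothetical, and the sketch assumes that the file has no header row and that the label is an integer; the actual file layout may differ.

```python
import csv
import io


def read_training_rows(text):
    """Parse pipe-delimited training data into (label, features) pairs.

    Assumptions (not confirmed by the tool): no header row, and the
    first column is an integer label. Feature values are kept as strings.
    """
    reader = csv.reader(io.StringIO(text), delimiter="|")
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        label = int(record[0])
        features = record[1:]
        rows.append((label, features))
    return rows


# Hypothetical sample: label | price | market cap | volume
sample = "1|10.50|2000000|35000\n0|3.20|150000|1200\n"
parsed = read_training_rows(sample)
```

Keeping the features as strings leaves type conversion to the caller, which is safer while the exact column set is unknown.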

For example, if a data scientist specifies a lookback period of 10 weeks, a benchmark of "SPX", and an excess return requirement of 5%, then the training file will set a "1" in the label column only for securities whose price increase over the previous 10 weeks exceeded the price increase of SPX by at least 5%.
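The labeling rule in this example can be sketched in Python as follows. This is a minimal sketch, not the tool's actual code: the function name excess_return_label is hypothetical, the excess is assumed to be the difference between the two percentage returns, and a label of 0 for all other securities is also an assumption.

```python
def excess_return_label(sec_start, sec_end, bench_start, bench_end,
                        required_excess=0.05):
    """Return 1 if the security beat the benchmark by the required excess.

    Assumption: "excess" means the security's return minus the
    benchmark's return over the same period, both as fractions.
    """
    security_return = sec_end / sec_start - 1.0
    benchmark_return = bench_end / bench_start - 1.0
    excess = security_return - benchmark_return
    return 1 if excess >= required_excess else 0


# Security rises 12%, benchmark rises 5%: the excess is about 7%.
beats = excess_return_label(100.0, 112.0, 4000.0, 4200.0)
# Security rises 8%, benchmark rises 5%: the excess is about 3%.
misses = excess_return_label(100.0, 108.0, 4000.0, 4200.0)
```

Note that the second security was profitable yet still receives a 0, which reflects the requirement that trades beat the benchmark rather than merely make money.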

Feature values are taken as of the lookback date, which is calculated from the lookback period: the current date minus the number of weeks specified. If that date falls on a weekend or a holiday, the tool uses the value as of the previous market date. Prices are closing prices on market dates.
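The date rule described here might look like the following Python sketch. The function name lookback_date and the holidays parameter are hypothetical; the tool's real holiday calendar is not specified, so the caller supplies the set of holiday dates.

```python
from datetime import date, timedelta


def lookback_date(current, weeks, holidays=frozenset()):
    """Subtract the lookback period, then roll back to a market date.

    Assumption: a market date is any weekday that is not in the
    caller-supplied set of holidays.
    """
    candidate = current - timedelta(weeks=weeks)
    # weekday() returns 5 for Saturday and 6 for Sunday.
    while candidate.weekday() >= 5 or candidate in holidays:
        candidate -= timedelta(days=1)
    return candidate


# Sunday 2024-03-17 minus 10 weeks is Sunday 2024-01-07,
# which rolls back to Friday 2024-01-05.
rolled = lookback_date(date(2024, 3, 17), 10)
```

Passing holidays explicitly keeps the sketch free of any particular exchange calendar.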

Limitations: This tool is designed to create files that use fundamental indicators such as book value. As a result, there is currently no way to specify multiple lookback periods for prices (i.e., price-1, price-2, price-3, etc.). Additionally, since ETFs do not have fundamental data, they are currently excluded from the security universe.

Have thoughts on how to improve this? Need help figuring it out? Join the Fintel Machine Learning group.

Price; MarketCap; Volume;

Note: This technology preview shows only a sample of the full file. Production code will allow download of the entire data set.