[This article was first published on Yet Another Blog in Statistical Computing » S+/R, and kindly contributed to R-bloggers].

In [1]: # LOAD PACKAGES

In [2]: import pandas as pd

In [3]: import numpy as np

In [4]: from sklearn import preprocessing as pp

In [5]: from sklearn import cross_validation as cv

In [6]: from neupy.algorithms import GRNN as grnn

In [7]: from neupy.functions import mse

In [8]: # DATA PROCESSING

In [9]: df = pd.read_table("csdata.txt")

In [10]: y = df.ix[:, 0]

In [11]: y.describe()
Out[11]:
count    4421.000000
mean        0.090832
std         0.193872
min         0.000000
25%         0.000000
50%         0.000000
75%         0.011689
max         0.998372
Name: LEV_LT3, dtype: float64

In [12]: x = df.ix[:, 1:df.shape[1]]

In [13]: st_x = pp.scale(x)

In [14]: st_x.mean(axis = 0)
Out[14]:
array([  1.88343648e-17,   5.76080438e-17,  -1.76540780e-16,
        -7.71455583e-17,  -3.80705294e-17,   3.79409490e-15,
         4.99487355e-17,  -2.97100804e-15,   3.93261537e-15,
        -8.70310886e-16,  -1.30728071e-15])

In [15]: st_x.std(axis = 0)
Out[15]: array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [16]: x_train, x_test, y_train, y_test = cv.train_test_split(st_x, y, train_size = 0.7, random_state = 2015)

In [17]: [...]
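The session above standardizes the predictors with `pp.scale`, holds out 30% of the rows, and imports `GRNN` from neupy, but the fitting step itself is elided. As a rough illustration of what those pieces do, here is a minimal NumPy-only sketch. It is not neupy's or scikit-learn's implementation: the helper names (`standardize`, `split_train_test`, `grnn_predict`), the synthetic data and the smoothing value `sigma = 0.5` are all assumptions made for the example, and the shuffled split will not reproduce the rows picked by `train_test_split` with `random_state = 2015`. The prediction rule is the standard GRNN form, a Gaussian-kernel weighted average of the training targets.

```python
import numpy as np


def standardize(x):
    """Scale each column to zero mean and unit population standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)


def split_train_test(x, y, train_size=0.7, seed=2015):
    """Shuffle the rows with a seeded generator, then cut them into two parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(round(train_size * len(x)))
    train, test = idx[:n_train], idx[n_train:]
    return x[train], x[test], y[train], y[test]


def grnn_predict(x_train, y_train, x_new, sigma):
    """Predict each new row as a kernel-weighted average of training targets."""
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2ab, which avoids
    # building a 3-D array; tiny negative values from rounding are clipped.
    d2 = (
        (x_new ** 2).sum(axis=1)[:, None]
        + (x_train ** 2).sum(axis=1)[None, :]
        - 2.0 * x_new @ x_train.T
    )
    d2 = np.maximum(d2, 0.0)
    # Subtracting the row minimum leaves the weighted average unchanged but
    # keeps at least one weight equal to 1, so the denominator never hits 0.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)


# Synthetic stand-in for csdata.txt (hypothetical data, not the original file).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=200)

st_x = standardize(x)
x_train, x_test, y_train, y_test = split_train_test(st_x, y, train_size=0.7)

sigma = 0.5  # assumed smoothing parameter, chosen only for illustration
pred = grnn_predict(x_train, y_train, x_test, sigma)
test_mse = np.mean((pred - y_test) ** 2)
print("hold-out MSE:", test_mse)
```

Dividing by the population standard deviation (NumPy's default, `ddof=0`) is what makes `st_x.std(axis = 0)` come out as exactly 1 in Out[15]. Because a GRNN has a single tuning knob, `sigma` is usually picked by comparing hold-out error across a handful of candidate values rather than being fixed in advance.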