How to disable subword embedding training when using fastText?

Here is a snippet of the corpus I am using to train word embeddings.
news_subent_12402 news_dlsub_00322 news_dlsub_00001 news_sub_00035 news_subent_07737 news_sub_00038 news_dlsub_00925 news_subent_07934 news_sub_00057 news_dlsub_01826 news_dlsub_00437 news_sub_00037 news_sub_00050 news_dlsub_00205 news_sub_00270 news_subent_05735 news_dlsub_00143 news_subent_12439 news_sub_00051 news_subent_08446 news_dlsub_00091 news_sub_00222 news_dlsub_00009 news_dlsub_00126 news_subent_15202 news_dlsub_00019 news_sub_00076 news_dlsub_00059 news_subent_11158 news_subent_10981 news_dlsub_00634 news_dlsub_00018 news_subent_03496 news_subent_16059 news_subent_08005 news_dlsub_00020 news_subent_15460 news_dlsub_00908 news_subent_12712 news_sub_00258 news_sub_00048 news_dlsub_00022 news_dlsub_00206 news_dlsub_00106 news_sub_00248 news_sub_00047 news_subent_02476 news_subent_14554 news_dlsub_00134 news_sub_00070 news_subent_06676 news_dlsub_00306 news_subent_11635 news_dlsub_01137 news_sub_00081 news_dlsub_00024 news_dlsub_00242 news_dlsub_00920 news_dlsub_00198 news_subent_02562 news_subent_09358 news_dlsub_00101 news_subent_02696 news_subent_17124 news_sub_00244 news_dlsub_00045 news_sub_00049 news_dlsub_00575 news_dlsub_00163 news_subent_03497 news_subent_10972 news_subent_05406 news_sub_00039 news_subent_14976 news_subent_20148 news_subent_02955 news_sub_00245 news_subent_02399 news_dlsub_00669 news_subent_12423 news_dlsub_00180 news_dlsub_00013 news_dlsub_00075 news_sub_00264 news_dlsub_01833 news_sub_00040 news_sub_00257 news_dlsub_00021 news_subent_14967 news_subent_03495 news_dlsub_00035 news_subent_21377 news_sub_00059 news_dlsub_01260 news_sub_00232 news_dlsub_00316 news_dlsub_00014 news_dlsub_00023 news_dlsub_00046 news_subent_02007 news_dlsub_00458 news_dlsub_00269 news_subent_04653 news_subent_06231 news_dlsub_01751 news_dlsub_00186 news_dlsub_00043 news_dlsub_00128 news_subent_05276 news_sub_00259 news_dlsub_00102 news_sub_00268 news_dlsub_00185 news_sub_00041 news_subent_09122 news_dlsub_00116 news_subent_09210 news_subent_07733 
news_subent_06393 news_dlsub_00244 news_dlsub_00622 news_sub_00226 news_sub_00043 news_dlsub_00067
news_subent_03827 news_dlsub_00065 news_sub_00251 news_dlsub_01826 news_subent_17688 news_subent_07649 news_subent_02941 news_dlsub_00100 news_subent_08198 news_subent_02990 news_dlsub_00033 news_subent_02562 news_dlsub_00043 news_dlsub_00024 news_dlsub_00015 news_subent_07628 news_subent_07045 news_dlsub_00234 news_subent_09178 news_dlsub_00458 news_subent_02923 news_sub_00226 news_dlsub_00120 news_sub_00247 news_dlsub_00014 news_dlsub_01830 news_subent_02946 news_dlsub_00086 news_dlsub_00046 news_dlsub_00038 news_subent_16554 news_subent_03073 news_dlsub_00128 news_dlsub_00098 news_subent_02905 news_subent_09117 news_dlsub_00021 news_dlsub_00143 news_subent_03054 news_dlsub_00126 news_subent_16372 news_dlsub_01833 news_subent_03495 news_sub_00245 news_dlsub_00101 news_sub_00258 news_subent_11431 news_sub_00148 news_subent_09320 news_sub_00232 news_subent_02460 news_dlsub_00032 news_dlsub_00067 news_dlsub_00064 news_dlsub_00045 news_dlsub_00116 news_subent_11663 news_subent_03501 news_subent_02030 news_dlsub_00035 news_dlsub_00476 news_dlsub_00039 news_subent_14505 news_dlsub_00091 news_sub_00244 news_sub_00268 news_dlsub_00130 news_subent_02007 news_subent_03014 news_dlsub_00022 news_dlsub_00019 news_subent_09358 news_dlsub_00270 news_subent_17124 news_dlsub_00071 news_sub_00266 news_subent_06429 news_subent_02621 news_sub_00248
news_subent_03497 news_subent_03495 news_dlsub_01326 news_sub_00151 news_sub_00070 news_dlsub_00143 news_dlsub_00012 news_dlsub_00212 news_subent_04653 news_subent_02022 news_dlsub_00101 football_club_187 news_subent_02902 news_dlsub_00116 news_dlsub_00925 news_sub_00137 news_dlsub_00120 news_sub_00036 news_subent_02889 news_subent_14976 news_dlsub_00269 news_dlsub_00687 news_subent_15202 news_dlsub_00669 news_dlsub_00126 news_sub_00248 news_dlsub_00437 news_sub_00071 news_dlsub_00177 news_dlsub_00694 news_dlsub_00618 news_sub_00051 news_sub_00043 news_subent_14997 news_subent_02411 news_subent_16059 news_sub_00245 news_subent_02923 news_dlsub_00035 news_sub_00069 news_subent_05320 news_sub_00082 news_sub_00259 news_dlsub_01035 news_dlsub_00413 news_sub_00072 news_dlsub_00020 news_sub_00052 news_dlsub_00023 news_subent_03496 news_subent_02893 news_subent_16508 news_sub_00065 news_sub_00047 news_subent_05740 news_subent_13389 news_sub_00055 news_subent_09439 news_subent_02991 news_sub_00268 news_dlsub_00003 news_subent_04609 news_subent_03509 news_subent_04069 news_dlsub_00128 news_dlsub_00099 news_dlsub_00206 news_dlsub_00582 news_sub_00037 news_dlsub_00021 news_sub_00247 news_dlsub_01179 news_sub_00057 news_dlsub_00046 news_sub_00039 news_sub_00050 news_subent_03014 news_sub_00042 news_dlsub_01826 news_sub_00038 news_dlsub_00410 news_subent_12422 news_sub_00048 news_subent_13648 news_dlsub_01807 news_subent_20148 news_sub_00084 news_sub_00049 news_dlsub_00029 news_subent_11392 news_dlsub_00412 news_sub_00246 news_sub_00244 news_subent_16385 news_dlsub_00634 news_subent_13536 news_subent_03073 news_sub_00226 news_subent_11478 news_sub_00035 news_subent_14967 football_club_192 news_sub_00232 news_sub_00054 news_subent_06587 news_dlsub_00014 news_subent_02399 news_dlsub_00013 news_dlsub_00102 news_sub_00040 news_subent_01990 news_dlsub_00007 news_subent_07675 news_subent_07719 news_sub_00041 news_subent_04655 news_dlsub_00300 news_dlsub_00019 news_subent_07756 
news_dlsub_00234 news_sub_00076
Every line is a sentence, and a token such as news_dlsub_00001 is a single intact word. I do not want fastText to build subword embeddings; I only want embeddings for the intact words, such as news_dlsub_01326, news_subent_12402, and so on.
There are 15,354 distinct words in my corpus and about 10 million rows (sentences) overall.
Here is the training script:
./fasttext skipgram -input user_profile_tags_rows.txt -output model_user_tags -lr 0.01 -epoch 50 -wordNgrams 1 -bucket 200000 -dim 128 -loss hs -thread 80 -ws 5 -minCount 1
So how can I change the training command to disable training of subword embeddings, for efficiency? Thanks.

If you want to train word embeddings with no subword information, you can set the -maxn parameter to 0. This means that you only use character ngrams with a max length of 0, i.e., no character ngrams are used.

Set both options to zero: -maxn 0 -minn 0
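
To see why these flags switch subwords off, here is a simplified Python sketch of how character n-grams are enumerated for a word. It is an illustration under stated assumptions, not fastText's actual source: the function name char_ngrams is hypothetical, and it iterates over Python characters rather than UTF-8 bytes. The skipgram defaults are -minn 3 -maxn 6, so subwords are enabled unless you turn them off.

```python
def char_ngrams(word, minn, maxn):
    """Return the character n-grams of `word` with lengths in [minn, maxn].

    Simplified sketch of subword extraction (hypothetical helper,
    not part of the fastText API).
    """
    token = "<" + word + ">"  # words are wrapped in boundary markers
    ngrams = []
    for start in range(len(token)):
        # n-gram lengths start at 1 at the earliest, even if minn is 0
        for n in range(max(minn, 1), maxn + 1):
            end = start + n
            if end > len(token):
                break
            # a lone boundary marker "<" or ">" is not a useful n-gram
            if n == 1 and (start == 0 or end == len(token)):
                continue
            ngrams.append(token[start:end])
    return ngrams


if __name__ == "__main__":
    # With a nonzero maxn, every word contributes several subwords.
    print(char_ngrams("abc", 3, 3))             # ['<ab', 'abc', 'bc>']
    # With maxn = 0 the length loop never runs, so there are no subwords.
    print(char_ngrams("news_sub_00035", 0, 0))  # []
```

With -maxn 0 the inner loop has no lengths to iterate over, so each token such as news_sub_00035 is represented only by its own word vector.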

Related

How can I read skeleton pose data file and visualize it using MATLAB?

I have a human pose data file in the format shown below, and I am having difficulty visualizing the skeleton data. I am currently using MATLAB and trying different ways to visualize it, but nothing has worked so far.
How can I visualize it?
Hips LeftUpperLeg RightUpperLeg LeftLowerLeg RightLowerLeg LeftFoot RightFoot Spine Chest Neck Head LeftShoulder RightShoulder LeftUpperArm RightUpperArm LeftLowerArm RightLowerArm LeftHand RightHand LeftToes RightToes LeftEye RightEye Jaw LeftThumbProximal LeftThumbIntermediate LeftThumbDistal LeftIndexProximal LeftIndexIntermediate LeftIndexDistal LeftMiddleProximal LeftMiddleIntermediate LeftMiddleDistal LeftRingProximal LeftRingIntermediate LeftRingDistal LeftLittleProximal LeftLittleIntermediate LeftLittleDistal RightThumbProximal RightThumbIntermediate RightThumbDistal RightIndexProximal RightIndexIntermediate RightIndexDistal RightMiddleProximal RightMiddleIntermediate RightMiddleDistal RightRingProximal RightRingIntermediate RightRingDistal RightLittleProximal RightLittleIntermediate RightLittleDistal UpperChest LastBone
0 -6.606338 0.9278492 -4.300881 -6.585656 0.9053868 -4.398254 -6.618386 0.9093576 -4.201249 -6.71897 0.4772302 -4.432689 -6.68056 0.4624649 -4.204533 -6.610827 0.09348767 -4.409986 -6.552395 0.08352266 -4.226445 -6.605658 1.099044 -4.304219 -6.605563 1.236343 -4.304299 -6.639235 1.466708 -4.309548 -6.683933 1.555007 -4.316747 -6.60193 1.405864 -4.37484 -6.623822 1.405828 -4.235976 -6.568427 1.382382 -4.483674 -6.624481 1.378808 -4.122891 -6.518499 1.102309 -4.523542 -6.581257 1.099841 -4.069668 -6.529497 0.8499395 -4.549026 -6.634782 0.8523263 -4.051447 -6.74858 0.01387946 -4.475498 -6.701818 0.006144805 -4.190523 -6.756795 1.6316 -4.352144 -6.759987 1.631549 -4.288719 0 0 0 -6.557372 0.8228443 -4.545695 -6.584511 0.7927774 -4.529844 -6.59354 0.7646196 -4.51826 -6.568016 0.7513906 -4.558767 -6.575216 0.7095199 -4.560191 -6.5817 0.6861121 -4.553126 -6.547763 0.744909 -4.55823 -6.551714 0.6964812 -4.552429 -6.556111 0.6726769 -4.540556 -6.528478 0.7442315 -4.550719 -6.530251 0.7011524 -4.545532 -6.53367 0.6771137 -4.534364 -6.509906 0.7459834 -4.542469 -6.511012 0.7105113 -4.537654 -6.512474 0.6924514 -4.531447 -6.665133 0.8302335 -4.062074 -6.694083 0.8066689 -4.0844 -6.704619 0.7802257 -4.098482 -6.689557 0.7613502 -4.055117 -6.704025 0.7215727 -4.059046 -6.712921 0.701099 -4.070944 -6.670976 0.7516301 -4.051549 -6.680717 0.7050958 -4.063133 -6.685835 0.6843727 -4.079605 -6.650858 0.7478803 -4.05471 -6.657919 0.7060179 -4.063848 -6.662397 0.6840374 -4.078377 -6.630877 0.746653 -4.058627 -6.636095 0.7121329 -4.066616 -6.638836 0.6952109 -4.075157 0 0 0 0 0 0

Getting faulty FFT results in MATLAB

I am working on a signal. The problem I am facing is that when I apply FFT at a higher resolution, the results are distorted or inaccurate.
% This is the Signal
ppg=[112.876329616814
112.658408873981
112.621201482138
112.229109437548
111.965434300402
111.819245235084
111.841263181141
112.034975558851
112.229725672171
112.401002397509
112.483441565446
112.537095785523
112.542167639986
112.517623071764
112.530208400404
112.494331187315
112.458340941558
112.491893571651
112.446357268045
112.437848223891
112.468898497367
112.492229975598
112.513807423899
112.558165404527
112.585333504471
112.632528318017
112.602232391053
112.667122234561
112.702144016400
112.742078840809
112.770641027009
112.705587358155
112.725453248592
112.610860932793
112.505600920437
112.487277414681
112.411549384328
112.371246006390
112.360145209525
112.353754972283
112.326490619891
112.334771225799
112.308532889401
112.351352504480
112.374822910342
112.400805703872
112.377898361460
112.427400530787
112.441468011030
112.487825137776
112.439115590095
112.466759319904
112.452625972089
112.406563665424
112.280788492275
112.293628779368
112.168353755994
112.159275439531
112.124102673341
112.135144200310
112.114654566371
112.100083083019
112.080270978462
112.120150827635
112.134439453760
112.156685466933
112.225453576662
112.287027263325
112.333599889113
112.355033113993
112.378141995004
112.429045612203
112.457869702086
112.446108693452
112.182607770090
112.172291031572
112.141941738264
112.135048950453
112.071363786720
112.093823632894
112.118452285912
112.154260471553
112.122354546715
112.204188788361
112.260325725976
112.317032864848
112.257972343352
112.363700230453
112.436783319341
112.457697907478
112.438722450783
112.430362124392
112.355503599989
112.242165740932
112.166705570292
112.163616654977
112.085798076462
112.184362698808
112.153380937729
112.262361656787
112.224531961545
112.244349806038
112.299966267499
112.255424005753
112.312405810283
112.331168831169
112.383220614130
112.364187970045
112.365720855222
112.410753157449
112.456950646392
112.432425527838
112.372746578952
112.211072664360
112.188389081123
112.172113289760
112.151095732411
112.158093342346
112.287870215162
112.293031150688
112.266507326308
112.271329839831
112.304199076742
112.265933856062
112.310326900020
112.339695911893
112.378147248247
112.473536985823
112.514993681027
112.529122799596
112.528742074844
112.605329735200
112.526943835385
112.462135267330
112.385319738140
112.441471501166
112.448451454950
112.449974353958
112.425600543248
112.436932348697
112.427383074442
112.447754859520
112.503825754972
112.486296171361
112.388719124013
112.426279602750
112.464391817333
112.495840760547
112.551141668789
112.513411425176
112.628942486085
112.571133412043
112.408067990111
112.379208267462
112.387577797402
112.394219329184
112.364188099482
112.405313640608
112.384305237246
112.374023983266
112.386142882234
112.416160605464
112.480669827844
112.498013584519
112.543765218506
112.551326412918
112.570091215874
112.521257936741
112.468922209407
112.417916186082
112.428681276432
112.453331488272
112.479538244438
112.488535687381
112.488068838927
112.489596706596
112.501862088722
112.473300725428
112.486755477968
112.452806047650
112.546537015539
112.595369631167
112.660854755442
112.715084510180
112.668289105923
112.564813138360
112.491403530284
112.498697082319
112.526036994318
112.521722414456
112.606011734873
112.613438869378
112.706794350468
112.709914268738
112.733341821577
112.788090993973
112.816951022833
112.840548340548
112.899414311179
112.964519140990
112.982683982684
112.965834818776
112.868129587150
112.819760603335
112.801068716104
112.895339954163
112.905992700110
112.922162804516
112.960653154463
113.001264968797
112.995510620634
112.925014237129
112.945664727579
112.950660002563
112.977230979538
113.027126318937
113.051945832799
113.176574533880
113.106155752061
113.102652825836
113.047802127387
113.064266941043
113.117979232419
113.114727965440
113.159112129700
113.113070092937
113.136957813428
113.121019379042
113.399967215369
113.367061789600
113.352146693361
113.305748879779
113.282189807311
113.220978471920
113.180702182517
113.104017968821
113.072536204024
113.083557606049
113.054936135674
113.066598316887
113.074672134649
113.116066470161
113.150241360161
113.135717031911
113.158955957111
113.205006621385
113.220812508010
113.243111623734
113.224913494810
113.259296712421
113.241617859265
113.191664544606
113.174093879976
113.077189191491
112.950745439788
112.943312401213
112.979259030598
113.077073253544
113.066505390035
113.098221698138
113.112731541459
113.102715466352
113.106468207118
113.153820205768
113.150868611908
113.173089897116
113.190293472761
113.178402766065
113.102167313206
112.926472532296
112.876893709052
112.789109583227
112.758169934641
112.724514048043
112.742182884575
112.739968811373
112.716794140972
112.710236798825
112.740984371850
112.784984456069
112.821738161678
112.854162908890
112.815290685626
112.797747606299
112.842966380452
112.773617048632
112.618259350926
112.600013906049
112.505248769236
112.509846150566
112.570719946351
112.616040728203
112.521252509718
112.567446968445
112.599698745483
112.592843342328
112.646111102240
112.596601770668
112.669232120331
112.641172136137
112.655610256191
112.661025833021
112.679463209493
112.738903840403
112.870380719711
112.912401324166
112.846829640947
112.798192004074
112.701758108839
112.530854766149
112.482880849863
112.499466017344
112.508394207356
112.488060147806
112.514289375881
112.493186381306
112.543918757086
112.665845288267
112.688900287130
112.807218755271
112.834668578175
112.703031331897
112.722244399380
112.785603653376
112.802255542740
112.855312059464
113.013241660300
113.030387912741
112.873026095195
112.718142594729
112.708659062754
112.668631722842
112.641462685292
112.656627792729
112.647315135204
112.655517108804
112.647229697979
112.676961852279
112.704985262079
112.707932846341
112.761287364656
112.732347290155
112.734565047584
112.867568213821
112.931068507103
112.988718817859
113.096054398748
113.131037108322
113.183455193348
113.132948220611
113.114479676168
113.036346770113
113.004849047057
112.972971833361
112.874736268792
112.791262804419
112.823775318571
112.846056722087
112.780428706548
112.763848953305
112.728996296355
112.773206229775
112.818483253410
112.864382694037
112.884692557840
112.890568367494
113.008012709447
113.036289097277
113.060172622443
113.033218358328
112.958648383101
112.867703803842
112.856821465318
112.857759746664
112.812466347153
112.872717982564
112.837206331634
112.842406051915
112.787900435554
112.820527036897
112.793913965998
112.821891809763
112.848107872062
112.864297202015
112.906432063394
112.934805994312
112.981381931212
112.980988998115
112.895035105583
112.860105450029
112.734561363358
112.710121401466
112.679145097747
112.668994599551
112.650356920091
112.639027724380
112.633687897817
112.596479986330
112.606518860267
112.616728608655
112.644709299842
112.671664744329
112.718424966814
112.752544288479
112.800183391354
112.828843621553
112.836520468927
112.803382077760
112.756687708242
112.647200882837
112.720451876531
112.670787365554
112.628967979851
112.599208268020
112.606262959222
112.650391603175
112.636879961639
112.698316834841
112.726519381039
112.739990757021
112.749811260567
112.763804561646
112.805069697711
112.804922370644
112.787026705349
112.659918037149
112.610357123849
112.557432540316
112.647731010409
112.613608632452
112.721092680038
112.671744081352
112.679526699929
112.647638618922
112.596299125711
112.617944147356
112.659663865546
112.672608437314
112.815456927925
112.851304541265
112.844278288642
112.858829482206
112.776522179120
112.762180282434
112.665898203276
112.590707988075
112.541139358148
112.539482418919
112.496684314953
112.622777795404
112.623454635828
112.685556206676
112.762185866082
112.779739938865
112.826916274674
112.853017416599
112.912892207147
112.941246033883
112.824242167026
112.738681577523
112.694177285142
112.670372062871
112.650307661243
112.731169561481
112.723954630818
112.750522551963
112.748358072765
112.786045475284
112.788209954482
112.717587252766
112.748643684053
112.803537101115
112.848135332564
112.996763891394
113.041028379858
113.026433931036
112.865621916413
112.784671882484
112.758984310443
112.742009947915
112.811968768794
112.795005180797
112.850054464784
112.833737831853
112.863392795501
112.917648809208
112.922372245514
112.964116624954
112.952604995503
112.978151993996
113.111056881240
113.091364364609
113.087417373129
113.022313234093
113.048170783465
113.005984211867
112.918286860859
112.845102311077
112.917435294243
112.912325894544
112.986173841887
112.982177965291
112.969758901485
113.015487785406
113.035499494385
113.073436585236
113.091114353573
113.143344019256
113.152078343712
113.138629565365
113.029550531249
112.999786406938
113.001871231548
112.944465472846
112.925657196778
112.934570869563
112.898286983639
112.883933529839
112.917383628208
112.930283224401
112.954291084626
112.966045943799
113.022342824547
112.982367248748
113.058469521207
113.086284808983
113.120224563791
113.109032812815
113.087716690906
112.994595973283
112.973766293114
112.963393962581
112.926665710202
112.837771418237
112.826324244391
112.789232330639
112.760623402571
112.776947070295
112.770400599385
112.789997502564
112.834024344161
112.869151467090
112.917605607566
112.979117293856
112.991755880699
113.099321569644
113.117088724731
113.111462458953
113.077239534035
113.051189070670
112.982163939956
112.733214497920
112.660555131143
112.666393455711
112.657877789546
112.629605777879
112.607975985821
112.594947016590
112.720315398887
112.731953111823
112.676844003321
112.589539568575
112.565206994947
112.649276966720
112.690694195410
112.811581244345
112.779338429967
112.782398694936
112.670864180447
112.586601663748
112.472449158812
112.419992655238
112.385291315617
112.420904481893
112.401797661760
112.370877026501
112.368957814657
112.355779226664
112.339231355657
112.379705400982
112.323382422601
112.426832714032
112.464321318712
112.489143125223
112.519381374049
112.572308971782
112.602206027391
112.663735749034
112.648479760247
112.670776347760
112.673502086054
112.581578209853
112.445712951862
112.365070087250
112.285572864830
112.283231056635
112.352623189410
112.346652308119
112.261601264576
112.336333783224
112.319318651285
112.430092061360
112.539214032721
112.561519649182
112.602462472592
112.659892055996
112.698304941811
112.689704039384
112.718808331527
112.742074844141
112.769783145527
112.848751897453
112.852209478833
112.753921403272
112.597653254932
112.542998101664
112.424751718869
112.416136151430
112.383375290730
112.383843652369
112.432141484045
112.376051817829
112.401556237992
112.435193119342
112.484551396316
112.568140153991
112.584591350369
112.616983628509
112.675009166122
112.691247814743
112.678780137314
112.724381286923
112.762318377774
112.794712104655
112.772516524673
112.740466006004
112.687882249794
112.674818441755
112.640000000000
112.634551283412
112.631111865990
112.640299595185
112.624341044859
112.671663466091
112.631888645187
112.650431733043
112.652424290808
112.670477406993
112.712800042578
112.747032838363
112.819220483702
112.872867089045
112.895640841512
112.982858114499
112.951486232292
112.873234339241
112.770168190420
112.730335581203
112.705770717275
112.733941827399
112.723419211230
112.694379029250
112.707733248177
112.768700217330
112.721197161699
112.725248831147
112.766703806970
112.794789872040
112.816438122090
112.878081079927
112.903840403264
112.937225638865
112.960049445338
112.964184098635
112.933323387432
112.831712694237
112.793200771091
112.782526642392
112.729847494553
112.734418386091
112.733778661563
112.714714490583
112.762150491777
112.803538259291
112.881079704609
112.870258503691
112.884309352863
112.926109837875
112.882563641299
112.843088438348
112.881174131157
112.887102363296
112.799948737665
112.723396984066
112.740602525896
112.713009138277
112.673583790719
112.647579884840
112.640203008882
112.649798055587
112.665700956479
112.664807442573
112.702267366497
112.702138070418
112.707149686604
112.704724660475
112.714009147963
112.776382603523
112.762589621939
112.777789569758
112.748628452767
112.652080195432
112.583522410266
112.587021198461
112.549789143972
112.629551936017
112.598230764335
112.580138077536
112.599350332265
112.593478122142
112.629726455398
112.622927088877
112.645608358037
112.700083079479
112.736548730932
112.748022658972
112.763486359480
112.686832967642
112.630338921671
112.594039579916
112.567010034585
112.567051443446
112.664309298115
112.659111726488
112.672446479923
112.641678905136
112.629395621929
112.687345454159
112.710956070428
112.723440645466
112.764388384002
112.820705902373
112.860454374107
112.830046828709
112.803797266733
112.761874422605
112.747494713127
112.730746750832
112.774562906367
112.782335861137
112.743012414087
112.786450631403
112.808693110869
112.776847807001
112.801021874501
112.831963873964
112.895573864725
112.939550729266
112.975818286225
112.995294368146
112.958335994550
112.885950905878
112.860914181090
112.830554313337
112.824644448936
112.817971356952
112.786524505735
112.826588980798
112.837622457870
112.868628774630
112.899616134050
112.916733475978
112.958273824377
112.960524988452
113.000366750116
113.059006013216
113.075905873106
113.005940678236
113.051680712456
113.008421164133
113.030962343096
112.963405447273
112.916720601810
112.872636070417
112.913275201522
112.950258511209
113.012960099472
113.055979765454
113.097724144894
113.130966614061
113.159350060042
113.182694453596
113.279546732591
113.244320894651
113.284396101802
113.235240306060
113.171662298804
113.090696884815
113.107843137255
113.092436974790
113.093752652576
113.092182327476
113.099185128597
113.227498902772
113.197294739124
113.227188396020
113.218671359899
113.283900096563
113.317013336117
113.378421900161
113.399042194511
113.381879480963
113.512548477458
113.439232431426
113.321390905303
113.323803880143
113.361110967975
113.412681318115
113.366601492607
113.347833527810
113.362395954881
113.370822060354
113.384016649324
113.451904266389
113.580478668054
113.728158168574
113.827679500520
113.865265348595
113.809698231009
113.761331945890
113.697776944002
113.674306141805
113.628662795854
113.641871669465
113.609810316487
113.597728447128
113.606890676367
113.644694558964
113.662243692295
113.647199171210
113.617383840816
113.651233177986
113.704795233537
113.723400584505
113.734986119316
113.731371419595
113.656888516521
113.558812113837
113.470775502485
113.379366863843
113.381950927839
113.364125104245
113.361062011885
113.337396476395
113.348935522955
113.362446697472
113.357621278002
113.322746144253
113.343246983546
113.389510684536
113.434052960027
113.486422668241
113.510541406645
113.558399392815
113.603896103896
113.674122952416
113.680395283123
113.574607584679
113.491860574955
113.435709971375
113.305364634473
113.214855952794
113.173845781459
113.152789636235
113.110105898399
113.047487709621
113.056834021513
112.979051308504
112.981518836672
112.896989257519
112.890903341415
112.905779822322
112.939854537789
112.916809152893
112.967830397637
112.979242772980
113.016871845767
113.058077873788
113.087556824189
113.102476930549
112.977970585123
112.850536746491
112.711549161434
112.653528168678
112.567575999535
112.448132075672
112.387445887446
112.501011975038
112.449148254343
112.442846416599
112.363574634229
112.380932171920
112.431016627810
112.462349733154
112.549883216550
112.573139897911
112.600575641766
112.635347372773
112.668974180930
112.736447410232
112.691326029347
112.723530130272
112.745652658271
112.729972397601
112.546976625747
112.501247292237
112.469553887791
112.392396548893
112.348796932350
112.374886972339
112.345571166445
112.300013793250
112.293053507199
112.218810042099
112.245591812187
112.279271271868
112.302902245475
112.327810569006
112.388697582083
112.445113870424
112.538805130871
112.526097855649
112.669085816145
112.697988286224
112.766906558513
112.739500758981
112.602941098708
112.537984890926
112.521254488057
112.437173216394
112.441516275932
112.442275253420
112.411637382226
112.395869136442
112.444813418432
112.426294484736
112.441558441558
112.466098836229
112.511511216057
112.577205262270
112.612793051105
112.650531286895
112.698255415591
112.649981756939
112.711361704599
112.728677996167
112.662580444205
112.593273043074
112.544956366113
112.532201439153
112.500652000612
112.481677492204
112.441406703922
112.457461763680
112.458517084933
112.466004632127
112.475141977855
112.581024934404
112.610842743788
112.666170407091
112.705448824551
112.735562277937
112.783287666894
112.854903863394
112.874816744539
112.910592429308
112.891203288884
112.795108838876
112.711074509307
112.707363094924
112.640407144026
112.581805113678
112.594477405163
112.586524530178
112.653450815388
112.590329602814
112.742738282529
112.710432386986
112.737506147118
112.868991398212
112.945437679204
112.957117557767
113.008686119076
113.039551357733
113.068519143194
113.023907910272
112.943498060381
112.874936751560
112.764622210001
112.712248535778
112.702105084458
112.799352767922
112.785688979605
112.778497512070
112.777820668066
112.791399850883
112.807347987711
112.837890573361
112.929372575476
112.981826614944
112.996753246753
113.072061055827
113.084921571935
113.063501433631
113.011384719177
112.923132062742
112.873882610896
112.886152808231
112.881388092427
112.867852926294
112.916006071850
112.873713948389
112.903398549502
112.927770281666
112.945522010457
112.944678697925
112.982416933716
113.025334565131
113.056032434829
113.047535453305
113.107732070695
113.095308986095
113.032213106965
112.923251293622
112.960412190448
112.905031297045
112.916363559577
112.897297411561
112.903592681457
112.956315625544
112.910176347725
112.920159691193
112.983844740566
112.976755007509
112.949639374775
112.909763157339
113.012684409362
113.005972858319
113.011043514507
112.989701154272
112.916285233110
112.902571478428
112.863145315235
112.855277003696
112.857645957708
112.849429328603
112.880512706809
112.838292803761
112.885120760195
112.924818687806
112.970484061393
112.993127002867
113.021757463316
113.083024118738
113.109040310339
113.113467701130
113.140411536515
113.079735199865
113.015053128689
112.948220610558
112.913180974869
112.889357395851
112.920349131388
112.928107606679
112.843684868695
112.834049925524
112.841148942014
112.950187042616
112.913744836839
112.969818615903
112.989847983858
113.030202512122
113.064014443635
113.106541898322
113.090548481364
113.003111874257
112.913253203012
112.884900251812
112.864191657284
112.792375304458
112.843887206962
112.834250027739
112.801771941788
112.886278403989
112.885010722473
112.902927287901
112.963148184370
112.982390848105
113.061101571787
113.061566503764
113.073500089673
113.036153984112
112.957925522368
112.887396661493
112.842334209665
112.784268118623
112.738200422198
112.760106363295
112.785861928601
112.799801338849
112.888949940563
112.912212832749
112.933104776416
112.934267036199
113.003531177976
113.073671631689
113.060981477663
113.113893078245
113.048182658044
112.967801132708
112.895467863270
112.864553253266
112.853150323347
112.847699478908
112.822404645638
112.816829879900
112.801256391681
112.797914474415
112.815669325537
112.796273700517
112.891981929019
112.904777793667
112.894693842906
112.800232281906
112.762425919861
112.749383661080
112.750566187145
112.775736575781
112.823823961293
112.845966131776
112.886278403989
112.912398509180
112.947831499436
112.879348411701
112.800371853245
112.754045529709
112.670261583032
112.655782729717
112.627535542519
112.604494869583
112.630485818828
112.646390727776
112.645925424589
112.672659207716
112.702015608807
112.721093039487
112.743468787680
112.770229411143
112.817676871773
112.766263913492
112.697818787510
112.614623367149
112.602999994716
112.577427655112
112.579201032700
112.560641192601
112.578947089975
112.606965437337
112.631891563400
112.662072758619
112.679038135078
112.794822252489
112.832685606722
112.863078265930
112.888521865256
112.864477611940
112.812711842001
112.686404424868
112.678558796357
112.641732720649
112.578676011307
112.615560519602
112.614374794880
112.612045692749
112.610309452979
112.649268979536
112.663335223389
112.692162982825
112.747426091099
112.772229260200
112.803057368056
112.788210508485
112.783006992267
112.722104127669
112.613386634517
112.617308241566
112.577745337607
112.561453312854
112.597722578253
112.623438253068
112.610337625254
112.618459667398
112.655389577770
112.671083732121
112.698245502713
112.827257081921
112.831670394399
112.909570447734
112.935303193625
112.880180922791
112.776615306575
112.747320126012
112.652852207862
112.659744130376
112.651878812246
112.600182284491
112.641136974986
112.697833933110
112.700793185265
112.696147152024
112.714176131705
112.751842309910
112.804024779267
112.818287141126
112.873332522616
112.878427725137
112.861686105195
112.792101801816
112.693639174440
112.672878623715
112.695034486404
112.623252909392
112.634868891064
112.671595647418
112.686646830954
112.713207786443
112.739936183251
112.764659890539
112.819052046574
112.847104557074
112.889001627666
112.862724767099
112.891866008455
112.950558719783
112.921001152573
112.872900258999
112.794438507201
112.737104218323
112.731066403292
112.692672959138
112.666673733979
112.666744459885
112.681328231815
112.713788370947
112.772031877145
112.767795954899
112.812621477484
112.858661898439
112.909132239051
112.912673253703
112.943186509575
112.983010969212
113.042426421219
112.953027946203
112.919942268346
112.805392421405
112.789054315736
112.778985008940
112.779773926468
112.762175041034
112.732118079461
112.731847302377
112.729404543963
112.773917917156
112.794378776495
112.835505936334
112.878070959181
112.913640424846
112.965060450966
113.006860827739
113.036291558750
113.091496300674
113.047187191520
113.014839676609
112.890575692062
112.799196234996
112.730050897855
112.763251630360
112.671648277105
112.687937594928
112.693620578370
112.712914457462
112.732441436201
112.777377239812
112.777207386769
112.803838077153
112.873979058145
112.898788077301
112.952209883457
112.986443794047
112.998658115233
112.975398264713
112.925978162686
112.844087896756
112.769369121552
112.739309236608
112.722380292565
112.701140796972
112.685846091989
112.697805861620
112.688607380507
112.695595100727
112.750064739115
112.782810574691
112.822130874347
112.864538860762
112.839426181780
112.896561207544
112.941422837531
112.972688419191
112.982433098013
112.984937925193
112.987531013760
112.876596871112
112.792115283787
112.759534161984
112.736528246329
112.645863154881
112.660371856792
112.641582856628
112.613123671277
112.642007226184
112.680703314385
112.663567533224
112.731067401163
112.799277035847
112.818936587377
112.858213411831
112.843781937568
112.890465725860
112.904651347278
112.954312681996
112.905692285740
112.849322596327
112.793270348530
112.734855377584
112.652752868729
112.665888035683
112.619094498608
112.637108627380
112.610155861471
112.634085981704
112.689817726362
112.698928992238
112.700286005805
112.718379096311
112.769973893099
112.826288697932
112.829268034526
112.860185939268
112.889748549323
112.907650487080
112.832690629215
112.739543474694
112.691882280095
112.604031605074
112.586744216788
112.575068513443
112.579530894266
112.594114721661
112.601844560133
112.606148366196
112.628071381362
112.632981336602
112.647011901680
112.661046173375
112.774216754032
112.830412635610
112.849116385352
112.868299027668
112.914696629213
112.753143298440
112.688562535341
112.664109739775
112.667961493885
112.694501492664
112.658762253605
112.672004405753
112.660620059760
112.675877852889
112.674426971999
112.736409995980
112.741841899075
112.742419433335
112.792756964864
112.832110615446
112.869024715110
112.889495167396
112.912854997991
112.933389669378
112.805691005425
112.698962668751
112.659793378827
112.626032577694
112.646348427487
112.711601462600
112.663522345128
112.715546992441
112.721602337448
112.750756085697
112.751197362363
112.733124625889
112.747608629018
112.771385518281
112.779097688687
112.799550956864
112.848592781046
112.919277127550
112.906771905137
112.928969204652
112.858294363445
112.849042939532
112.746521700044
112.691440024114
112.662444377424
112.728673719418
112.744762035807
112.731252610597
112.706423522072
112.687772475784
112.703116324081
112.686785347846
112.730461401103
112.643943770460
112.672834711054
112.731149712625
112.820871121146
112.859066971647
112.912744837836
112.843613515392
112.855656007885
112.766372375492
112.719626267227
112.721040414640
112.667989263625
112.643333951910
112.677352138066
112.742010086951
112.706234751822
112.709704730659
112.728877039294
112.744113096374
112.767936245523
112.800770419286
112.858292223328
112.867948473888
112.865343563080
112.840126188309
112.861425758506
112.870724122612
112.746474406411
112.674705421929
112.634342408665
112.602645782623
112.593931951473
112.604358210217
112.643647519306
112.661250396699
112.653322895288
112.645027595103
112.670175234448
112.614255576763
112.571695053577
112.642482771751
112.649653347446
112.732807621064
112.740168643342
112.766906003151
112.721348243285
112.673965100728
112.574154389042
112.518313789113
112.496391275611
112.490913801399
112.527866128273
112.505977550448
112.520531279588
112.526394870221
112.454569502788
112.462915235580
112.443148070703
112.492045268165
112.509187490410
112.560360198300
112.600307899039
112.644358585405
112.654760331714
112.648286289790
112.515672803108
112.431449350539
112.406233116876
112.326448341343
112.338243247542
112.321748439272
112.338723530597
112.355109090716
112.362145150523
112.356283202732
112.350968727134
112.399291975368
112.414724368554
112.445467369515
112.510425177467
112.543168482816
112.625077483033
112.610657868214
112.563606010017
112.504964659256
112.456854620789
112.409611735900
112.374617364502
112.441843483782
112.403548752955
112.412858126357
112.423915691363
112.415915231788
112.417626773881
112.483904630233
112.519247170001
112.469354573909
112.493885743641
112.565511621878
112.559966560316
112.577647613348
112.517781390841
112.377592856123
112.343035596136
112.334369588128
112.349533669437
112.416761473376
112.383361290288
112.398389256063
112.379469746311
112.395694005390
112.400989129111
112.470376430345
112.488404469877
112.495812608260
112.505389843258
112.561216979683
112.591114469769
112.619121801811
112.516461960631
112.427560080001
112.333125218256
112.293336437423
112.250013239983
112.242514113821
112.296787099711
112.282709315257
112.236650247350
112.218886260198
112.232105201810
112.243210308251
112.283822859624
112.318899848496
112.330765889370
112.395738591191
112.420077383792
112.470084655076
112.477463975724
112.465540143629
112.347171731089
112.314996982179
112.312381597468
112.233054575132
112.212605692825
112.290747170611
112.342638801420
112.357445095827
112.368607552027
112.371789714383
112.380696369986
112.393493317518
112.406201681740
112.443310105268
112.468193492017
112.551605814292
112.593362511047
112.635954211805
112.684823750819
112.612275987559
112.494043966739
112.412144566069
112.369511395137
112.357544976459
112.372275173717
112.387008684516
112.395791121997
112.383240631466
112.381022878081
112.416690940097
112.435019607636
112.416736853815
112.480348698200
112.422027094300
112.486599248160
112.501021996759
112.528938923779
112.582434939312
112.585154040859
112.583967566767
112.463769958573
112.427055155291
112.371582033839
112.307583101576
112.305029596774
112.306598199287
112.351449006623
112.348290157086
112.335625830567
112.297951267077];
% plotting frequency domain
plot(abs(fft(ppg,1024)));
The result I get with a 1024-point FFT is:
% plotting frequency domain
plot(abs(fft(ppg,2048)));
The result I get with a 2048-point FFT is:
As far as I know, increasing the resolution should not distort the FFT result. If someone knows why this is happening, please let me know.
Note: I am using the property editor in the figure window to zoom out on my results.
fft(ppg,2048) pads ppg with zeros to make it 2048 elements long, then applies the FFT. Zero-padding this signal is a problem because it introduces a large jump: the samples are all close to 112, and the padding drops abruptly to 0. That jump influences the output of the FFT much more than the actual data does.
If you want to increase the resolution of the frequency domain that you compute, you will have to sample your signal for longer. There are no shortcuts.
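The effect described above can be reproduced with a small NumPy sketch. The signal here is synthetic (a 1.5 Hz sine of amplitude 0.3 riding on an offset of 112, sampled at an assumed 100 Hz), not the ppg data from the question, and the helper name padded_spectrum is made up for illustration. Subtracting the mean before padding is a common workaround that the answer above does not mention; it removes the jump, but it only interpolates the spectrum and does not add real frequency resolution.

```python
import numpy as np

def padded_spectrum(x, n_fft):
    """Magnitude of the n_fft-point real FFT; NumPy zero-pads x when n_fft > len(x)."""
    return np.abs(np.fft.rfft(x, n=n_fft))

# Synthetic stand-in for the ppg vector (assumed values, not the real data).
fs = 100.0                      # assumed sampling rate in Hz
t = np.arange(1500) / fs        # 1500 samples, shorter than the FFT length
x = 112.0 + 0.3 * np.sin(2 * np.pi * 1.5 * t)

n_fft = 2048
freqs = np.fft.rfftfreq(n_fft, d=1 / fs)

raw = padded_spectrum(x, n_fft)                  # padding jumps from ~112 to 0
centered = padded_spectrum(x - x.mean(), n_fft)  # padding jumps from ~0 to 0

# Locate the strongest non-DC bin in each spectrum.
peak_raw = freqs[1:][np.argmax(raw[1:])]
peak_centered = freqs[1:][np.argmax(centered[1:])]

print("strongest non-DC bin, raw signal:      %.3f Hz" % peak_raw)
print("strongest non-DC bin, mean subtracted: %.3f Hz" % peak_centered)
```

In the raw case the strongest bins sit right next to 0 Hz, because leakage from the 112-to-0 step swamps the small oscillation. After mean subtraction the peak lands near the sine frequency.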

GloVe embeddings - unknown / out-of-vocabulary token

I found an "unk" token in the GloVe vector file glove.6B.50d.txt downloaded from https://nlp.stanford.edu/projects/glove/. Its value is as follows:
unk -0.79149 0.86617 0.11998 0.00092287 0.2776 -0.49185 0.50195 0.00060792 -0.25845 0.17865 0.2535 0.76572 0.50664 0.4025 -0.0021388 -0.28397 -0.50324 0.30449 0.51779 0.01509 -0.35031 -1.1278 0.33253 -0.3525 0.041326 1.0863 0.03391 0.33564 0.49745 -0.070131 -1.2192 -0.48512 -0.038512 -0.13554 -0.1638 0.52321 -0.31318 -0.1655 0.11909 -0.15115 -0.15621 -0.62655 -0.62336 -0.4215 0.41873 -0.92472 1.1049 -0.29996 -0.0063003 0.3954
Is it a token to be used for unknown words or is it some kind of abbreviation?
The unk token in the pretrained GloVe files is not an unknown token!
See this Google Groups thread, where Jeffrey Pennington (GloVe author) writes:
The pre-trained vectors do not have an unknown token, and currently the code just ignores out-of-vocabulary words when producing the co-occurrence counts.
It's an embedding learned like any other on occurrences of "unk" in the corpus (which appears to happen occasionally!)
Instead, Pennington suggests (in the same post):
...I've found that just taking an average of all or a subset of the word vectors produces a good unknown vector.
You can do that with the following code (should work with any pretrained GloVe file):
import numpy as np

GLOVE_FILE = 'glove.6B.50d.txt'

# First pass: count the vectors and read the hidden dim from the last line
with open(GLOVE_FILE, 'r', encoding='utf-8') as f:
    for i, line in enumerate(f):
        pass
n_vec = i + 1
hidden_dim = len(line.split(' ')) - 1

vecs = np.zeros((n_vec, hidden_dim), dtype=np.float32)

# Second pass: parse every line, skipping the leading token
with open(GLOVE_FILE, 'r', encoding='utf-8') as f:
    for i, line in enumerate(f):
        vecs[i] = np.array([float(n) for n in line.split(' ')[1:]], dtype=np.float32)

# The mean over all word vectors serves as the unknown-word vector
average_vec = np.mean(vecs, axis=0)
print(average_vec)
For glove.6B.50d.txt this gives:
[-0.12920076 -0.28866628 -0.01224866 -0.05676644 -0.20210965 -0.08389011
0.33359843 0.16045167 0.03867431 0.17833012 0.04696583 -0.00285802
0.29099807 0.04613704 -0.20923874 -0.06613114 -0.06822549 0.07665912
0.3134014 0.17848536 -0.1225775 -0.09916984 -0.07495987 0.06413227
0.14441176 0.60894334 0.17463093 0.05335403 -0.01273871 0.03474107
-0.8123879 -0.04688699 0.20193407 0.2031118 -0.03935686 0.06967544
-0.01553638 -0.03405238 -0.06528071 0.12250231 0.13991883 -0.17446303
-0.08011883 0.0849521 -0.01041659 -0.13705009 0.20127155 0.10069408
0.00653003 0.01685157]
And because it is fairly compute intensive to do this with the larger glove files, I went ahead and computed the vector for glove.840B.300d.txt for you:
0.22418134 -0.28881392 0.13854356 0.00365387 -0.12870757 0.10243822 0.061626635 0.07318011 -0.061350107 -1.3477012 0.42037755 -0.063593924 -0.09683349 0.18086134 0.23704372 0.014126852 0.170096 -1.1491593 0.31497982 0.06622181 0.024687296 0.076693475 0.13851812 0.021302193 -0.06640582 -0.010336159 0.13523154 -0.042144544 -0.11938788 0.006948221 0.13333307 -0.18276379 0.052385733 0.008943111 -0.23957317 0.08500333 -0.006894406 0.0015864656 0.063391194 0.19177166 -0.13113557 -0.11295479 -0.14276934 0.03413971 -0.034278486 -0.051366422 0.18891625 -0.16673574 -0.057783455 0.036823478 0.08078679 0.022949161 0.033298038 0.011784158 0.05643189 -0.042776518 0.011959623 0.011552498 -0.0007971594 0.11300405 -0.031369694 -0.0061559738 -0.009043574 -0.415336 -0.18870236 0.13708843 0.005911723 -0.113035575 -0.030096142 -0.23908928 -0.05354085 -0.044904727 -0.20228513 0.0065645403 -0.09578946 -0.07391877 -0.06487607 0.111740574 -0.048649278 -0.16565254 -0.052037314 -0.078968436 0.13684988 0.0757494 -0.006275573 0.28693774 0.52017444 -0.0877165 -0.33010918 -0.1359622 0.114895485 -0.09744406 0.06269521 0.12118575 -0.08026362 0.35256687 -0.060017522 -0.04889904 -0.06828978 0.088740796 0.003964443 -0.0766291 0.1263925 0.07809314 -0.023164088 -0.5680669 -0.037892066 -0.1350967 -0.11351585 -0.111434504 -0.0905027 0.25174105 -0.14841858 0.034635577 -0.07334565 0.06320108 -0.038343467 -0.05413284 0.042197507 -0.090380974 -0.070528865 -0.009174437 0.009069661 0.1405178 0.02958134 -0.036431845 -0.08625681 0.042951006 0.08230793 0.0903314 -0.12279937 -0.013899368 0.048119213 0.08678239 -0.14450377 -0.04424887 0.018319942 0.015026873 -0.100526 0.06021201 0.74059093 -0.0016333034 -0.24960588 -0.023739101 0.016396184 0.11928964 0.13950661 -0.031624354 -0.01645025 0.14079992 -0.0002824564 -0.08052984 -0.0021310581 -0.025350995 0.086938225 0.14308536 0.17146006 -0.13943303 0.048792403 0.09274929 -0.053167373 0.031103406 0.012354865 0.21057427 0.32618305 0.18015954 -0.15881181 0.15322933 
-0.22558987 -0.04200665 0.0084689725 0.038156632 0.15188617 0.13274793 0.113756925 -0.095273495 -0.049490947 -0.10265804 -0.27064866 -0.034567792 -0.018810693 -0.0010360252 0.10340131 0.13883452 0.21131058 -0.01981019 0.1833468 -0.10751636 -0.03128868 0.02518242 0.23232952 0.042052146 0.11731903 -0.15506615 0.0063580726 -0.15429358 0.1511722 0.12745973 0.2576985 -0.25486213 -0.0709463 0.17983761 0.054027 -0.09884228 -0.24595179 -0.093028545 -0.028203879 0.094398156 0.09233813 0.029291354 0.13110267 0.15682974 -0.016919162 0.23927948 -0.1343307 -0.22422817 0.14634751 -0.064993896 0.4703685 -0.027190214 0.06224946 -0.091360025 0.21490277 -0.19562101 -0.10032754 -0.09056772 -0.06203493 -0.18876675 -0.10963594 -0.27734384 0.12616494 -0.02217992 -0.16058226 -0.080475815 0.026953284 0.110732645 0.014894041 0.09416802 0.14299914 -0.1594008 -0.066080004 -0.007995227 -0.11668856 -0.13081996 -0.09237365 0.14741232 0.09180138 0.081735 0.3211204 -0.0036552632 -0.047030564 -0.02311798 0.048961394 0.08669574 -0.06766279 -0.50028914 -0.048515294 0.14144728 -0.032994404 -0.11954345 -0.14929578 -0.2388355 -0.019883996 -0.15917352 -0.052084364 0.2801028 -0.0029121689 -0.054581646 -0.47385484 0.17112483 -0.12066923 -0.042173345 0.1395337 0.26115036 0.012869649 0.009291686 -0.0026459037 -0.075331464 0.017840583 -0.26869613 -0.21820338 -0.17084768 -0.1022808 -0.055290595 0.13513643 0.12362477 -0.10980586 0.13980341 -0.20233242 0.08813751 0.3849736 -0.10653763 -0.06199595 0.028849555 0.03230154 0.023856193 0.069950655 0.19310954 -0.077677034 -0.144811
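For the larger files, the two-pass approach above holds every vector in memory at once. A minimal single-pass sketch, assuming the same space-separated "token value value ..." line layout, keeps only a running sum instead. The function name streaming_mean and the three-line toy input are illustrative and not part of GloVe.

```python
import io
import numpy as np

def streaming_mean(handle):
    """Average all vectors in a GloVe-style text stream in one pass."""
    total = None
    count = 0
    for line in handle:
        values = line.rstrip().split(" ")[1:]  # drop the leading token
        vec = np.array([float(v) for v in values], dtype=np.float64)
        total = vec if total is None else total + vec
        count += 1
    if count == 0:
        raise ValueError("no vectors found")
    # Accumulate in float64 for accuracy, return float32 like the code above.
    return (total / count).astype(np.float32)

# Toy stand-in for a GloVe file (hypothetical values).
toy = io.StringIO("the 0.0 1.0\ncat 2.0 3.0\nunk 4.0 5.0\n")
mean_vec = streaming_mean(toy)
print(mean_vec)
```

With a real file, an open text handle can be passed in place of the StringIO object, for example the result of open(GLOVE_FILE, encoding='utf-8').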
Since I can't comment, I'm writing another answer.
If anyone is having trouble using the vector given above by @jayelm because copy-pasting doesn't work, here are two lines of code that give you the vector ready to use in Python.
vec_string = '0.22418134 -0.28881392 0.13854356 0.00365387 -0.12870757 0.10243822 0.061626635 0.07318011 -0.061350107 -1.3477012 0.42037755 -0.063593924 -0.09683349 0.18086134 0.23704372 0.014126852 0.170096 -1.1491593 0.31497982 0.06622181 0.024687296 0.076693475 0.13851812 0.021302193 -0.06640582 -0.010336159 0.13523154 -0.042144544 -0.11938788 0.006948221 0.13333307 -0.18276379 0.052385733 0.008943111 -0.23957317 0.08500333 -0.006894406 0.0015864656 0.063391194 0.19177166 -0.13113557 -0.11295479 -0.14276934 0.03413971 -0.034278486 -0.051366422 0.18891625 -0.16673574 -0.057783455 0.036823478 0.08078679 0.022949161 0.033298038 0.011784158 0.05643189 -0.042776518 0.011959623 0.011552498 -0.0007971594 0.11300405 -0.031369694 -0.0061559738 -0.009043574 -0.415336 -0.18870236 0.13708843 0.005911723 -0.113035575 -0.030096142 -0.23908928 -0.05354085 -0.044904727 -0.20228513 0.0065645403 -0.09578946 -0.07391877 -0.06487607 0.111740574 -0.048649278 -0.16565254 -0.052037314 -0.078968436 0.13684988 0.0757494 -0.006275573 0.28693774 0.52017444 -0.0877165 -0.33010918 -0.1359622 0.114895485 -0.09744406 0.06269521 0.12118575 -0.08026362 0.35256687 -0.060017522 -0.04889904 -0.06828978 0.088740796 0.003964443 -0.0766291 0.1263925 0.07809314 -0.023164088 -0.5680669 -0.037892066 -0.1350967 -0.11351585 -0.111434504 -0.0905027 0.25174105 -0.14841858 0.034635577 -0.07334565 0.06320108 -0.038343467 -0.05413284 0.042197507 -0.090380974 -0.070528865 -0.009174437 0.009069661 0.1405178 0.02958134 -0.036431845 -0.08625681 0.042951006 0.08230793 0.0903314 -0.12279937 -0.013899368 0.048119213 0.08678239 -0.14450377 -0.04424887 0.018319942 0.015026873 -0.100526 0.06021201 0.74059093 -0.0016333034 -0.24960588 -0.023739101 0.016396184 0.11928964 0.13950661 -0.031624354 -0.01645025 0.14079992 -0.0002824564 -0.08052984 -0.0021310581 -0.025350995 0.086938225 0.14308536 0.17146006 -0.13943303 0.048792403 0.09274929 -0.053167373 0.031103406 0.012354865 0.21057427 0.32618305 0.18015954 -0.15881181 
0.15322933 -0.22558987 -0.04200665 0.0084689725 0.038156632 0.15188617 0.13274793 0.113756925 -0.095273495 -0.049490947 -0.10265804 -0.27064866 -0.034567792 -0.018810693 -0.0010360252 0.10340131 0.13883452 0.21131058 -0.01981019 0.1833468 -0.10751636 -0.03128868 0.02518242 0.23232952 0.042052146 0.11731903 -0.15506615 0.0063580726 -0.15429358 0.1511722 0.12745973 0.2576985 -0.25486213 -0.0709463 0.17983761 0.054027 -0.09884228 -0.24595179 -0.093028545 -0.028203879 0.094398156 0.09233813 0.029291354 0.13110267 0.15682974 -0.016919162 0.23927948 -0.1343307 -0.22422817 0.14634751 -0.064993896 0.4703685 -0.027190214 0.06224946 -0.091360025 0.21490277 -0.19562101 -0.10032754 -0.09056772 -0.06203493 -0.18876675 -0.10963594 -0.27734384 0.12616494 -0.02217992 -0.16058226 -0.080475815 0.026953284 0.110732645 0.014894041 0.09416802 0.14299914 -0.1594008 -0.066080004 -0.007995227 -0.11668856 -0.13081996 -0.09237365 0.14741232 0.09180138 0.081735 0.3211204 -0.0036552632 -0.047030564 -0.02311798 0.048961394 0.08669574 -0.06766279 -0.50028914 -0.048515294 0.14144728 -0.032994404 -0.11954345 -0.14929578 -0.2388355 -0.019883996 -0.15917352 -0.052084364 0.2801028 -0.0029121689 -0.054581646 -0.47385484 0.17112483 -0.12066923 -0.042173345 0.1395337 0.26115036 0.012869649 0.009291686 -0.0026459037 -0.075331464 0.017840583 -0.26869613 -0.21820338 -0.17084768 -0.1022808 -0.055290595 0.13513643 0.12362477 -0.10980586 0.13980341 -0.20233242 0.08813751 0.3849736 -0.10653763 -0.06199595 0.028849555 0.03230154 0.023856193 0.069950655 0.19310954 -0.077677034 -0.144811'
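import numpy as np
# split() with no separator splits on any whitespace (spaces or line breaks);
# dtype converts the string tokens to floats instead of leaving them as text.
average_glove_vector = np.array(vec_string.split(), dtype=np.float32)
print(average_glove_vector)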
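To show where such an average vector fits, here is a minimal sketch of a lookup that falls back to it for out-of-vocabulary words. The names load_vectors and lookup and the toy three-word vocabulary are hypothetical; a real setup would read a GloVe file instead.

```python
import io
import numpy as np

# Toy stand-in for a GloVe text file: "<token> <v1> <v2> ..." per line.
TOY_GLOVE = io.StringIO(
    "the 0.1 0.2 0.3\n"
    "cat 0.5 0.1 0.0\n"
    "unk 0.9 0.9 0.3\n"
)

def load_vectors(handle):
    """Parse GloVe-style lines into a dict of token -> float32 vector."""
    vectors = {}
    for line in handle:
        token, *values = line.rstrip().split(" ")
        vectors[token] = np.array([float(v) for v in values], dtype=np.float32)
    return vectors

vectors = load_vectors(TOY_GLOVE)

# Average of all word vectors, used for any token missing from the vocabulary.
unknown_vec = np.mean(list(vectors.values()), axis=0)

def lookup(token):
    """Return the token's vector, or the average vector if it is unseen."""
    return vectors.get(token, unknown_vec)

print(lookup("cat"))   # a known word: its own vector
print(lookup("dog"))   # not in the toy vocabulary: the average vector
print(lookup("unk"))   # "unk" is an ordinary word with its own learned vector
```

Note that "unk" is looked up like any other word here, which matches the point above that it is not a dedicated unknown token.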

MATLAB - Smart way of reordering the data to show the bell shape?

I have a vector of 810 doubles which, for a very good reason (not relevant here, so left out), should roughly follow a Gaussian distribution. I now wish to verify this.
-1425.35483258195
-2005.85887636008
1560.31387221920
422.221432204087
-396.336462872091
1028.04845220438
-126.818743685290
482.602336878657
-351.945829219904
-408.071209112604
-251.839429417447
-1325.82167938863
-1304.65143253464
984.905373772623
-866.213152797951
73.6149979242073
3834.52647065066
917.976216226379
815.189065312880
-96.0747513429396
-319.662630897599
-1221.53710722367
1190.72857085035
2144.87935230603
143.558912788403
-167.475218992091
84.4066585851642
604.944484054070
1509.18911810685
-587.472369780628
143.669853748808
-1412.71275982249
46.2128162171030
-141.952303144073
292.716350945286
-1952.97174976434
-391.978769841029
-573.922085615045
-1301.25783316103
-645.990154917124
-934.774832747395
116.250933690360
544.823087571245
160.470631259077
2602.37582436213
534.434360410673
989.338067269714
-447.272873139365
1118.95219395721
-898.345257943601
-681.176900874247
213.211488587428
-785.169609773158
-430.621935639622
-61.2013695948126
-377.605278281317
1497.99884427522
404.777865468637
-1504.27516707823
-789.565967187437
-1118.50469245357
665.045393010892
28.5870825819056
-1045.44580718356
409.712490593987
-772.727898866153
252.289204820367
-905.257023890462
-300.707805837961
-74.4828651478892
-1087.13018005821
275.455585388661
-947.372403119386
-348.098011249108
1645.59654581296
517.906791768635
322.688565166583
331.939567925477
-612.937714892369
1302.63905136005
-153.606350780020
962.124237023736
-627.507634881584
-226.761116511015
463.132204629023
-939.535480854969
1532.46771710064
222.343444164121
303.799291944722
1084.75952392675
856.849727349139
514.671849059982
-1655.41677124587
-127.246214481867
1734.57538636725
56.2555496894129
297.184996955808
386.823038887674
1353.32907458661
-2307.42128989437
654.140638203676
-1392.55190147038
-607.378457188426
2384.44806882885
2293.43321702051
47.3551805244424
746.478352537506
-938.994569192230
192.721071163348
-4.29989698278951
100.289121179456
812.825042468773
103.758682367813
-490.446548621378
-1339.81881565035
-595.379417250646
499.458821146142
765.412270843621
532.095358987890
-247.000541594816
-480.064780362965
316.263039414615
449.159830642379
1933.73501675200
52.4933985187945
3107.07754310984
218.560847122495
-575.905775870265
2124.73813469493
1272.05488997025
-777.848735713576
-1981.23363300488
-1677.42292095700
605.970193322080
-759.457775207884
-84.6880200736514
-825.486129072864
739.859927410010
781.960329365344
-77.4149957958771
57.2191392940895
-500.299306284041
-575.933520504403
-1402.37563293800
-164.380079314064
232.496199533151
-1150.91949913415
-76.7244722653677
-47.1854688325975
3140.15003755460
-1330.34731058125
145.959328309073
1600.84592274169
-24.9232437067194
1395.72597116652
1621.21780908032
85.7934267467826
3279.93389739169
1516.36196103567
-1039.08123306694
-26.8801938594925
1492.40185616989
2445.00749952465
356.999233644735
962.558708481810
52.1166636462876
-180.338443931614
1135.86865368933
-823.056149900716
513.081279614810
1632.55462408653
700.098711621567
-1004.01052033257
838.798263982723
49.8997525149898
55.7119287736441
70.2170006303140
1694.63693249428
340.232240670027
-785.266479583255
-1490.75705865048
907.610252998604
-338.053785111386
-158.302237129483
810.050487377534
357.250435108372
-963.243345806145
-514.816899574995
-110.692584354545
-523.919862464934
1179.10220528566
-707.786358763919
-744.020374418825
-184.772566675096
1048.29906053004
291.212627519886
-1699.41811637976
1169.44527009486
-981.678801748923
-1003.40713607552
380.192537521396
-246.707457668937
550.802388031726
-537.288326878588
-1909.14320854900
334.084003654602
1315.91990144012
-87.7648557606226
484.711537732028
473.487768459009
-348.920356370175
1290.48676735822
128.895752577464
-463.231949345272
-461.975905174972
-1960.31200780286
-463.066933400454
636.123377337945
-457.893970268988
-886.605181442714
1288.19479205883
1534.45512372237
358.985063025987
-528.812332235894
-853.054006933991
-470.516697933259
75.7814123333455
922.488730629832
-607.322778020469
-1168.32280143675
-1164.49270789924
1144.46795817114
237.919069158304
464.605144587411
-1779.15853404145
-687.068044606544
-1558.76542615201
1122.67865170534
147.377599700542
-1941.99704382516
-197.265704116704
771.418343613877
617.611707163922
-184.024688145049
-630.977365935732
-1449.60039078064
-2031.87254125474
339.115574107012
507.110373028804
-35.3207932025325
-226.486793234861
-1151.56599554854
-1626.99321244231
137.738574478250
753.742389575134
254.480213830903
-299.494668002752
-464.217408373797
-496.881983902247
124.175797483519
641.036273039545
-875.645225304455
-850.013624134959
-1109.89290174708
-1564.11183301231
-697.565976576565
1033.07113738597
51.6226154312835
771.394497917418
942.389088995320
1196.05347790507
-678.014674370847
-1166.17784824523
-999.823651819537
401.157806208508
-477.448061210651
-73.4660915645727
354.733942551547
-431.951507899472
1318.88051933731
344.338765880639
598.804769543009
-1248.79480727276
858.421109550656
-556.899467429002
1807.63182070400
-159.136636488214
1799.22265828335
-240.692257347021
-1220.72830688348
74.1221061405495
-1640.11048850379
572.382227723556
94.5360732832287
-605.392031424351
-1493.66630795796
2797.18192240374
1616.53036000672
572.718159305989
-2204.37593002237
-322.774296955364
-430.000508898373
-104.048115418703
-1087.07923450010
-144.392429036394
-32.6087990828473
814.526216253877
-942.330594442485
561.790955470880
-919.551937169830
-637.665692814220
-27.5056243422514
-53.4289661228959
-349.487021035242
1254.06511966928
-1084.45861311495
2229.81290397501
-61.7348956711430
2315.31893617308
-905.051392195664
-330.066447009382
667.731590154659
2362.88627865678
-134.122103628698
-998.406485815768
3412.50099946158
1738.40741493709
-1435.63471165575
279.643055725084
-1080.07001067794
1038.67436238825
-59.5107168975392
-428.623061940542
90.9674755392625
428.864025012534
461.053414725420
76.7429752580429
-1047.74156835217
-1138.74437265796
-274.421370192137
-123.113141820210
1062.88624964445
-31.2932925583336
-47.2386460915968
566.285925381789
-162.476912673828
-579.149923138140
-1645.39762646152
-0.858506362164007
-1388.80667204739
-1611.24592897178
791.558877147674
-866.957703178365
-670.383595544925
-274.315832321348
832.836094551398
167.966896050509
-795.723924812035
-463.256974967680
1774.72662476188
565.080479736877
-567.419721200751
-1293.18551532982
-102.106516298420
-1023.45366146646
342.686949749701
-1943.95971161213
-1242.88726281114
-363.843860974623
218.911330607643
-1299.25729831318
1801.62607745420
485.598606188077
-209.745413876955
824.225434162006
80.4120792716949
-1101.71843965055
-489.648697328510
515.844053843147
921.482043018526
-1007.19911926667
892.329352367407
-206.559385161153
1289.34370927758
132.875816484038
-693.868519040030
1238.26425299907
-1029.59553086535
-1045.03807191815
684.486795552311
1156.78869170002
503.287695641337
-1761.08210583943
1504.49354570708
-2.38904808358893
-891.872139762875
-471.810563237261
-357.989880589008
1237.27821337192
-1118.74885097858
-1664.34130366938
-707.692351910793
-179.392901121722
2.51492953811066
1283.18737455847
-621.389601400035
-142.879999439187
409.670524550240
-1520.69976017121
-945.049848553859
-508.792922740967
-1013.16358819067
1333.23041683582
-485.612620083036
633.086691051391
974.147758083274
-1482.92064521016
170.457510166752
-319.276155374235
-102.475737807179
-806.147249075137
12.2710130392989
-1284.51556919452
-898.153118894255
-1264.39247941500
310.518406626345
300.653823915844
771.414870652903
-443.317509228219
-217.820334710797
-568.546664318710
1221.84294901066
406.621006399999
1536.40526591371
-88.1133675293550
-423.424777451501
911.889602982767
-561.496263588238
271.138351172729
-853.973387287833
-919.890778546460
29.2848487526380
515.153053644342
916.906990841340
-988.795430988675
-2288.39552282448
-354.278819283692
727.755387319324
-438.729779965080
-460.743800289783
498.419836651718
-1016.52550793176
1719.96686296631
1378.78460794700
-74.1662909720290
-794.313252330921
-579.575585572261
1032.38039003017
-139.352044190169
-921.411154727281
-632.436686005542
511.937665864214
-465.867561162771
682.753565100946
-837.343300398131
-203.807199334120
697.210075868778
383.665875798750
-285.949563188691
473.458597852185
-1074.35861318824
-1397.97178609897
-1003.88094273428
711.645485233387
980.242460658460
-528.721307121879
105.957921554011
-688.634132580983
-1639.86682075114
-1477.62683757162
462.025696610903
398.670153266179
-553.384968966910
625.057323419982
1002.74945530885
493.980588087071
738.959303162354
-1262.67475377973
-572.177811667752
308.236783523600
-406.525047730467
-279.064813326399
-958.715002126351
667.070145627578
101.287176877154
1029.77737241096
884.572701132805
-83.2129981675607
436.454402083900
-731.784460332883
-828.152649525027
-4.59039366284560
1173.27675925715
947.875960433114
219.825903925695
-1348.62640205386
1416.03397544099
-867.559504844558
-921.730092961659
-594.959474993817
-578.601607691280
1713.94162285846
-890.538205788453
-31.1520768075825
-840.083103895247
271.138757797886
30.4852851688702
1077.87779754491
1027.97848743209
-1216.23657331206
718.721023399901
1168.72555297803
-1126.29889111422
-38.0485352514261
559.324906271146
143.737128536855
-520.823282690490
-421.404810071988
-879.950358622646
383.976795499795
431.651954748570
-1014.23132627610
791.227060578848
-446.251570984076
-880.263297275125
249.601560953378
598.668996636698
-850.966552797904
-755.673152069518
572.459612333701
177.692884397484
600.926582295822
1188.77012356426
-691.839959588951
-502.690126011454
25.3399035825159
-520.840003827040
-1012.71789390801
389.607229075335
2336.37944482949
4601.28549891861
-306.109062155709
-549.754157876436
1366.49195319864
80.1658288538984
-371.898746055809
221.220689352235
-1034.26423624067
-227.175386588575
456.976283321366
-893.331715773963
1139.45199359176
-426.972252590305
125.689745639213
-755.291604557464
906.715046470356
-512.375090745554
-183.436523634227
-1271.17935524883
343.944597498092
357.375062083205
503.360088312412
-1063.47172123093
-544.151922949868
-811.465116940168
-382.518818090521
-556.364297944514
28.4462104060804
-391.469958695141
702.920945890725
3276.75393467650
-1601.46189476513
215.094152272832
1289.65287462539
-266.279004487831
-359.857620157932
770.042919144449
984.724775247226
-124.345016575689
-479.520031553426
-477.614657604774
-88.1973812779706
112.358632460992
-1213.83458979284
-2164.47382612438
68.3193891329520
-881.771368628023
169.849935188382
988.964748477654
158.420971886695
199.625846049925
-532.329781433098
-35.8791437174177
1327.24414199818
562.056024879379
-476.849700670497
221.935429478316
1563.51220970406
-799.111728532053
451.956789308570
-271.333350632898
637.294349346712
1842.67665511535
1462.24940915688
632.600524370893
-400.567297177738
-423.141178019932
-599.578935046919
1631.33084732187
1274.89043581579
417.304963875531
-1671.85306193716
-875.134111594071
-888.175742491925
781.420136563384
-267.891345155953
-984.747911033292
619.692042477853
507.097508348636
-2191.61648965449
871.941679622509
354.805731581013
-813.842611493990
-797.177241995762
-963.636395691701
69.1304461321361
-504.047776578296
2120.31188847033
-972.664353788443
1635.59419926198
813.341563425280
556.176646947061
853.969583042162
-340.944807273296
1860.51180529648
164.584505357234
2416.65258964103
453.392895539603
-598.129522562993
-436.423805800641
-1073.09763119235
818.515189790920
915.328394918082
-844.856562790346
-151.573550264624
188.640708868880
662.302131098894
599.417211331797
625.005959399488
-712.457731270164
-456.930434881017
430.006802672571
432.201988090676
-1805.76017728314
-765.777546844394
-366.775473893042
-1390.36840812369
-115.821548189160
-348.024152316248
-2081.96568815380
-1270.09386919888
590.375306666686
-1425.53765935045
3082.54680185326
127.134922589132
-64.8286220730133
1030.02663234355
814.903512139286
209.749857774112
-195.505605537004
-627.329338981148
-1312.39013357494
171.133025716999
1247.87792775840
-519.610915373372
2112.46320371606
9.32989625287883
165.878608899171
-1271.10665412487
-993.973494489265
2.05541319826716
-659.015889529154
1013.38348492602
719.325793086795
-533.063499513220
-120.770138644812
582.042029894734
-2453.18073103871
-480.093187602034
-947.472900692303
661.481266970943
2091.75056688107
264.617458558831
688.572625188444
-553.727400183363
1854.09100161690
-584.441964107862
160.859971291589
-721.797012172758
530.747141338406
-440.327138191801
110.306408202837
-1383.45033522105
-2165.65821622627
771.596558585712
168.287995174270
873.750929372014
-899.896768600454
826.404866814517
-954.055096167494
24.5476499542792
-1035.13149902373
242.010718795495
817.365138698160
-191.375310710834
1026.53362477036
1506.65413148945
-527.047853871006
-301.653867464794
-906.675209444651
-1.35693695415011
-1361.35311995217
-1055.16027830381
1835.99350558714
-153.771963586896
-1024.20571742302
-7.52437658966574
-199.706148737951
-55.6970184322190
358.588943940760
750.748910025249
-130.369658412103
115.801978429215
-362.228245710698
1072.90727368639
285.144147030216
275.476205540914
-425.410694245723
-358.251651764280
54.5762606959333
-986.393501781955
1987.82942003996
-119.612618624157
4088.87885652650
-46.4155713632708
855.384197547839
348.442497861890
-638.248417658654
1464.23877404674
1958.52428386456
102.248039598800
-111.573563229895
-485.111192525760
-766.212536691707
-2000.03833099073
-1256.32241772728
-42.4379799861126
664.131166310975
-801.923373300931
-983.323933269528
1648.18557989817
387.963306334943
-1569.93080059318
222.937758757887
-114.704995616332
-390.835557576743
-583.648071221651
898.816929040312
-1430.89073677080
485.974041322305
-546.204265548018
-928.957012051962
857.876890609405
275.037960659607
-984.994405754642
-2584.59906037955
-1238.24519741587
2252.11593175793
-615.096425393547
-1441.41851657964
1852.82338381085
-794.909638822273
-1724.84781684101
-837.605831849087
242.111467511679
-1348.52499237212
1575.34223230675
-1400.44850343094
2640.09986069654
-1216.91897365897
-861.181602306838
264.540049874899
722.940025618886
394.150610960769
-918.913133715832
-448.520560840232
169.730666854386
-553.694315285847
-389.285749031319
1917.29156813389
Since the vector is not ordered, plotting it directly in its current order gives a messy plot like this.
What is a smart way of easily reordering the data so that I can verify whether they roughly follow a normal distribution or not?
To see if your data follow a Gaussian distribution you can use normplot. It draws the empirical distribution function of the data on a scale such that a Gaussian corresponds to a straight line. The straighter the line, the closer the data is to a Gaussian.
It often happens that the curve is straight in the central part and deviates at its ends. That's the case with your data; see figure below.
The explanation is that the Gaussian distribution usually arises from a sum of random variables, by virtue of the Central Limit Theorem. The distribution of the sum is Gaussian in the limit as the number of summands tends to infinity. For a finite number of terms the distribution is only approximately Gaussian; and the approximation is better at the center than on the tails (in fact, that's why the theorem is called central).
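For the reordering part of the question, a normal probability plot can also be built by hand without the Statistics Toolbox. The following is a minimal sketch, not the implementation normplot uses: the stand-in vector X, the plotting positions (i - 0.5)/n, and the reference line through the sample mean and standard deviation are all assumptions made for illustration.

```matlab
% Stand-in data (assumption): replace X with the vector from the question.
rng(0);
X = 1000 * randn(600, 1);

x = sort(X(:));                      % reorder the data: ascending order statistics
n = numel(x);
p = ((1:n)' - 0.5) / n;              % plotting positions strictly inside (0, 1)
z = sqrt(2) * erfinv(2 * p - 1);     % standard normal quantiles via base-MATLAB erfinv

mu = mean(x);
sigma = std(x);

% Gaussian data should fall close to the straight line mu + sigma * z.
plot(z, x, '.', z, mu + sigma * z, '-');
xlabel('Standard normal quantile');
ylabel('Sorted data');
legend('data', 'Gaussian reference', 'Location', 'northwest');
```

Points that bend away from the line at either end indicate tails that are heavier or lighter than a Gaussian's.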
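If you have the Statistics Toolbox, you could also run one of the hypothesis tests, like the Jarque-Bera test. Its null hypothesis is that your points were drawn from a normal distribution. It returns h = 1 when that hypothesis is rejected at the chosen significance level, together with the p-value p.
[h, p] = jbtest(X, 0.05);  % test at the 5% significance level
if h
    % h = 1: the null hypothesis of normality is rejected
    fprintf('Normality is rejected at the 5%% level (p = %.3f)\n', p);
else
    % h = 0: the data is consistent with a normal distribution
    fprintf('Normality is not rejected at the 5%% level (p = %.3f)\n', p);
end
This doesn't give you a pretty plot, but it will give you a less biased answer than just eyeballing it.
Keep in mind that p is not the probability that the data is normal. In this case the p-value is at the 0.1% level, so the test rejects normality at the 5% level, which is consistent with the tails deviating from a Gaussian.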
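To make clear what the test measures, the Jarque-Bera statistic can be computed by hand from the sample skewness and kurtosis. This is a sketch under assumptions: X is stand-in data, and the p-value uses the asymptotic chi-square distribution with two degrees of freedom, whereas jbtest may use tabulated critical values, so the numbers need not match exactly.

```matlab
% Stand-in data (assumption): replace X with the vector from the question.
rng(0);
X = 1000 * randn(600, 1);

x = X(:);
n = numel(x);
d = x - mean(x);
m2 = mean(d.^2);                      % biased (1/n) variance
S = mean(d.^3) / m2^1.5;              % sample skewness, 0 for a Gaussian
K = mean(d.^4) / m2^2;                % sample kurtosis, 3 for a Gaussian

JB = n / 6 * (S^2 + (K - 3)^2 / 4);   % Jarque-Bera statistic

% Asymptotically JB follows a chi-square law with 2 degrees of freedom,
% whose upper tail probability is exp(-JB / 2).
pAsym = exp(-JB / 2);
alpha = 0.05;
rejectNormality = pAsym < alpha;      % true means evidence against normality

fprintf('skewness = %.3f, kurtosis = %.3f, JB = %.3f, p = %.3f\n', S, K, JB, pAsym);
```

A large statistic comes from skewness far from 0 or kurtosis far from 3, so heavy tails alone can be enough to reject normality.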
Simply use the histogram function:
histogram(X)
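To view an estimate of the PDF you can use:
if exist('histogram', 'file')  % histogram was introduced in R2014b
    histogram(data, 'Normalization', 'pdf')  % bar areas sum to 1
else
    hist(data)  % fallback for older releases (shows raw counts)
end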
This looks pretty normal to me. You can always try modeling it with a Gaussian.
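Modeling the data with a Gaussian can be sketched in base MATLAB by estimating the mean and standard deviation and overlaying the resulting density on a normalized histogram. The stand-in vector X and the variable names are assumptions made for illustration; histogram with 'Normalization', 'pdf' needs R2014b or later.

```matlab
% Stand-in data (assumption): replace X with the vector from the question.
rng(0);
X = 1000 * randn(600, 1);

mu = mean(X);                         % estimated mean
sigma = std(X);                       % estimated standard deviation

% Evaluate the fitted Gaussian density over the range of the data.
xs = linspace(min(X), max(X), 200);
pdfVals = exp(-(xs - mu).^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi));

histogram(X, 'Normalization', 'pdf'); % bar areas sum to 1, comparable to a density
hold on;
plot(xs, pdfVals, 'LineWidth', 1.5);
hold off;
legend('data', 'fitted Gaussian');
```

If the bars track the curve in the middle but sit above it far from the mean, the data has heavier tails than the fitted Gaussian.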