(Mild Humor) Notes by an OCD Data Scientist about Christmas

Christmas is an odd but challenging time of the year, especially for the nerdiest of us. The number of problems to solve is never-ending. There isn’t enough computational power, even with multiple Apache Spark clusters, to arrive at the correct solution. Here are some notes that I have on those challenges.

 

Gifting

  1. The quantity and price of gifts to family members vary inversely with the square of the distance from the clustered family center (me). The one-over-r-squared relationship fits well with other physical field calculations (sound, light, gravity, etc.).
  2. It is assumed that all gifts to/from family members should fit within +2/-0.5 standard deviations of the median cost of all gifts received.
  3. Wife received an iris plant (which she loves) and then had me identify it via the sepal length and width.
  4. Popularity: For an acceptable gift, stay within +3/-0.5 SD of the desired popularity variable. Note: This doesn’t guarantee acceptability by the person on the receiving end of the gift. “Popularity does not imply Acceptability” just like “Correlation does not imply Causation.”
  5. Text analysis of the words spoken at dinner, even with non-linear weights applied, doesn’t always help with gift choices. (Note to self: Check those weighting factors again and watch those n-gram filters lest you remove some important information!)
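
For the record, rules 1 and 2 can be sketched in a few lines of Python. This is only a rough sketch: the names gift_budget and acceptable_cost_range, the base budget, the "distance" units and the sample costs are all hypothetical, and the minimum distance is my own assumption so that the family center (me) doesn't divide by zero.

```python
from statistics import median, stdev


def gift_budget(base_budget, distance, min_distance=1.0):
    """Inverse-square rule: budget falls off as 1/r^2 from the family center.

    min_distance is an assumed clamp so that r = 0 does not blow up.
    """
    r = max(distance, min_distance)
    return base_budget / (r * r)


def acceptable_cost_range(costs_received, upper_sd=2.0, lower_sd=0.5):
    """Return (low, high): the median minus 0.5 SD up to the median plus 2 SD."""
    m = median(costs_received)
    sd = stdev(costs_received)  # sample standard deviation
    return (m - lower_sd * sd, m + upper_sd * sd)


# Hypothetical numbers, purely for illustration.
print(gift_budget(100.0, 2.0))  # 25.0
low, high = acceptable_cost_range([20, 25, 30, 35, 40])
print(round(low, 2), round(high, 2))
```

Note the asymmetric band: overspending by up to two standard deviations is tolerated, underspending by more than half of one is not.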

 

Shopping

People shop in a crowded mall; perform data analysis to determine the best time to go shopping with the fewest people, as implied by the number of cars in the parking lot or the number of people passing a specific spot within the mall (foot traffic).

  1. Median foot traffic increases over time, with a diurnal variation of 7%, as time approaches 0 [12/25 at 8:00 am].
    During the week, samples of traffic volume appear to be time-dependent, with two different variations.
  2. Monday-Friday: a double-hump (bimodal) density curve [a positive + a negative skew] with a smaller peak around noon and a larger peak after 7:47 pm, returning to zero when the store closes.
  3. Saturday-Sunday: a wide platykurtic Gaussian with no real peak but with the maximum number of people correlated to the number of parking spaces left.
  4. Pay attention to the foot traffic in front of each store in the mall. The median foot traffic with the greatest standard-deviation increase, correlated with the sex of the child and the amount of exposure time to TV and internet advertising, will help determine the perfect gift.
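
Picking the shopping window from foot-traffic samples might look something like the sketch below. It assumes the samples arrive as (hour, people_count) pairs; the function name quietest_hour and the counts are hypothetical, not real mall data.

```python
from collections import defaultdict
from statistics import median


def quietest_hour(samples):
    """Return the hour whose median people count is lowest.

    samples: iterable of (hour, people_count) pairs (an assumed format).
    """
    by_hour = defaultdict(list)
    for hour, count in samples:
        by_hour[hour].append(count)
    # The median is less sensitive than the mean to one freak busy day.
    return min(by_hour, key=lambda h: median(by_hour[h]))


# Hypothetical weekday samples: a small noon peak and a large evening peak.
samples = [(10, 40), (10, 50), (12, 120), (12, 110), (20, 300), (20, 280)]
print(quietest_hour(samples))  # 10
```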

 

Christmas Tree/House decorations

  1. Christmas tree cutting: as much as possible (with minimal tree cutting), the 2D outline should resemble a perfect equilateral triangle.
  2. The number and placement of Christmas tree elements: if taken as a 2D image from any angle around the full 360 degrees, the placement of the ornaments is the same within p<=0.05. This includes the gift spacing underneath the tree.
  3. The Christmas tree ornaments have a minuscule SD of the differences of the distances between the elements around them and are “perfectly” aligned along the X, Y and Z axes as well.
  4. The Christmas lights on the house should follow the edges only. The lines of the lights should be tidy/taut with no drooping anywhere. The lights should be white (full spectrum) lights to show inclusiveness. You don’t want to be influenced by those energetic deep UV photons to remove those deep infrared photons from the mix just because they are low energy. That would be photonic-color purism (racist towards photon energies).
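
One way to audit the ornament spacing in rule 3 is to measure each ornament's distance to its nearest neighbor and look at the spread. This is a minimal sketch that assumes the ornaments have already been surveyed into (x, y, z) coordinates; the function name and the coordinates are hypothetical.

```python
from math import dist
from statistics import pstdev


def nearest_neighbor_spacing_sd(positions):
    """Population SD of each ornament's distance to its nearest neighbor.

    positions: list of (x, y, z) tuples; needs at least two ornaments.
    """
    nearest = []
    for i, p in enumerate(positions):
        nearest.append(
            min(dist(p, q) for j, q in enumerate(positions) if j != i)
        )
    return pstdev(nearest)


# Hypothetical ornaments at the corners of a unit square: perfectly even.
ornaments = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(nearest_neighbor_spacing_sd(ornaments))  # 0.0
```

Anything much above zero means somebody moved an ornament.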

 

One final note

I don’t see how Santa does it. I still cannot predict that “Naughty/Nice” thing no matter how complex my adaptive predictive model (non-linear, of course) seems to be. PCA just doesn’t seem to work here either. There just are no actionable analytics that can be deduced from this data. There is just too much wideband noise and all my attempts at filtering have failed.

 

Have a happy and low-variance Christmas.