diff --git a/2022-12-18-23-11-02.png b/2022-12-18-23-11-02.png
new file mode 100644
index 0000000000000000000000000000000000000000..c0a39e1203baf4966d6c44f95f1e7795cbbc44e5
Binary files /dev/null and b/2022-12-18-23-11-02.png differ
diff --git a/Task1 reflect notes-DELETE.md b/Task1 reflect notes-DELETE.md
deleted file mode 100644
index 8947e190013ebec5f101e3899c88db5938c56ce4..0000000000000000000000000000000000000000
--- a/Task1 reflect notes-DELETE.md	
+++ /dev/null
@@ -1,11 +0,0 @@
-# file to store 'development process report' notes for task 1
-
-* pleasure of the process; also frustrating
-* process - crisp-dm - iterative
-* related to work -> importance of requirements and flexing, agility
-* pseudo code - getting into habit has been very useful (have a tendency, which I need to fight, to dive into the deep end and just manage to keep afloat)
-* getting into git (hub, lab) - but have been frustrated (authentication with gitlab - needing to switch to https from ssh, despite seemingly set up correctly)
-* single powerful tool, work environment - vs code - revolutionary
-* documentation - robust, standardised, conventions - i.e. commenting
-* intersections of domain, maths, coding
-* not enough time
diff --git a/UFCFVQ-15-M_Programming_Task_1.ipynb b/UFCFVQ-15-M_Programming_Task_1.ipynb
deleted file mode 100644
index bf8261028bb7b6dc2d30c5edd625a6efc314b1c0..0000000000000000000000000000000000000000
--- a/UFCFVQ-15-M_Programming_Task_1.ipynb
+++ /dev/null
@@ -1,1237 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# UFCFVQ-15-M Programming for Data Science (Autumn 2022)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">OVERALL COURSEWORK MARK: ___%</p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### GitLab link submission, README.md file and Git commit messages\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# Programming Task 1\n",
-    "\n",
-    "## Student Id: 05976423"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR1 - Develop a function to find the arithmetic mean"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "\n",
-    "$$\n",
-    "{\\displaystyle A={\\frac {1}{n}}\\sum _{i=1}^{n}a_{i}={\\frac {a_{1}+a_{2}+\\cdots +a_{n}}{n}}}\n",
-    "$$\n",
-    "\n",
-    "##### <b>`Pseudocode / Plan:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> list of values\n",
-    "\n",
-    "<b>`Steps:` </b>\n",
-    "\n",
-    "1. sum list values\n",
-    "2. divide sum of list values by count (len) of list of values\n",
-    "3. return result\n",
-    "4. add try except block\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* arithmetic mean"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "47.62"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# list of numbers\n",
-    "list = [85, 29, 35, 55, 82, 45, 42, 21, 42, 60, 56, 30, 72, 56, 37, 65, 29, 14, 66,  43, 23, 39, 81, 56, 74, 29, 22, 27, 14, 66, 55, 33, 31, 66, 63, 41, 30, 48, 68, 58, 51, 44, 66, 34, 20, 71, 59, 57, 43, 48]\n",
-    "\n",
-    "\n",
-    "\n",
-    "def mean(list):\n",
-    "    '''\n",
-    "    Function to calculate arithmetic mean - i.e. sum of data divided by number of data points.\n",
-    "    '''\n",
-    "    try:\n",
-    "        #print(sum(list) / len(list))\n",
-    "        return sum(list) / len(list)\n",
-    "    except ZeroDivisionError:\n",
-    "        print(\"Error: Division by zero. List is empty\")\n",
-    "    except TypeError:\n",
-    "        print(\"Error: Invalid typ in list. List must contain only numbers\")\n",
-    "    except:\n",
-    "        print(\"Error with list of numbers. Please check list\")\n",
-    "mean(list)\n"
-   ]
-  },
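-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "A quick, illustrative check (not required by the brief) that the try/except paths in `mean()` behave as intended:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Exercise the exception handling in mean() - illustrative checks only\n",
-    "mean([])        # empty list -> ZeroDivisionError message\n",
-    "mean([1, 'a'])  # mixed types -> TypeError message"
-   ]
-  },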
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR2 - Develop a function to read a single column from a CSV file"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "\n",
-    "##### <b>`Pseudocode / Plan:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> filename/path\n",
-    "\n",
-    "<b>`Parameters:` </b>\n",
-    "\n",
-    "* file - name, path\n",
-    "* column to read -  column number (0 - n-1, where n is number of columns)\n",
-    "* delimiter - assume comma (,), but allow for other separator, e.g. /t (tab), | (pipe), etc.\n",
-    "* header - True / False, default = True\n",
-    "\n",
-    "<b>`Steps:` </b>\n",
-    "\n",
-    "1. Open file (with open, so autocloses file)\n",
-    "2. if / else for header status\n",
-    "3. extract column name as value\n",
-    "4. iterate over remaining lines with list comprehension to return list of values in column\n",
-    "5. return column name and values or just values\n",
-    "6. add try except block\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* If header = True, return column name as value and list of data, else return only list of values\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 201,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "\n",
-    "def data_from_column(filename, columnNum, delimiter = ',', header = True):\n",
-    "    '''\n",
-    "    Function to read csv file and return a list of data from a specified column (by column number 0 to n-1).  Default delimiter is comma.  If header is True, the function will return the column name and the data in a list.  If header is False, the function will return the data in a list.\n",
-    "    '''\n",
-    "    try:\n",
-    "        with open(filename) as openFile:\n",
-    "            if header == True:\n",
-    "                variable = next(openFile).split(delimiter)[columnNum]  \n",
-    "                data = [line.split(delimiter)[columnNum] for line in openFile]              \n",
-    "                return  variable, data\n",
-    "            else:\n",
-    "                return [line.split(delimiter)[columnNum] for line in openFile]\n",
-    "    except FileNotFoundError:\n",
-    "        print(\"Error: File not found. Please check file name, extension and path\")\n",
-    "    except:\n",
-    "        print(\"Error with file. Please check file\")\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Return column names and data from csv file with header row\n",
-    "variable, data = data_from_column('task1.csv', 1, header = True)\n",
-    "print(variable)\n",
-    "print(data)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Demo where returns only data (assuming file has no header row) - note task1.csv has a header row so 'pop' included in list\n",
-    "data = data_from_column('task1.csv', 1, ',', header = False)\n",
-    "print(data)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR3 - Develop a function to read CSV data from a file into memory"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "##### <b>`Pseudocode:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> filename/path\n",
-    "\n",
-    "<b>`Parameters:` </b>\n",
-    "\n",
-    "* file\n",
-    "* column to read, by column number (0 - n-1, where n is number of columns)\n",
-    "* delimiter - assume comma (,), but allow for other separator, e.g. /t (tab), | (pipe), etc.\n",
-    "* header - True / False, default = True\n",
-    "  \n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* If header = True, return column name as value and list of data, else return only list of values\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 205,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def read_csv_into_dictionary(filename, delimiter = ','):\n",
-    "    '''\n",
-    "    Function to read csv file and return a dictionary with column names as keys and data as values. Default delimiter is comma.  Assumes that the first line of the file contains the column names (i.e. header).\n",
-    "    '''\n",
-    "    try:\n",
-    "        with open(filename) as openFile:\n",
-    "            variable = next(openFile).split(delimiter)\n",
-    "            data = [line.split(delimiter) for line in openFile]\n",
-    "            variableData_dict = {variable[i]: [row[i] for row in data] for i in range(len(variable))}\n",
-    "            return variableData_dict\n",
-    "\n",
-    "    except FileNotFoundError:\n",
-    "        print(\"Error: File not found. Please check file name, extension and path\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "read_csv_into_dictionary('task1.csv')"
-   ]
-  },
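-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "One niggle visible in the FR5 output further down: the last column name keeps its trailing newline. A possible refinement (a sketch under an illustrative new name, not the marked version) strips whitespace from the header names:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Sketch of a possible refinement (not the submitted version): strip the\n",
-    "# trailing newline from the header names when splitting the first line\n",
-    "def read_csv_into_dictionary_stripped(filename, delimiter = ','):\n",
-    "    '''\n",
-    "    As read_csv_into_dictionary, but with whitespace stripped from the column names.\n",
-    "    '''\n",
-    "    try:\n",
-    "        with open(filename) as openFile:\n",
-    "            variable = [name.strip() for name in next(openFile).split(delimiter)]\n",
-    "            data = [line.split(delimiter) for line in openFile]\n",
-    "            return {variable[i]: [float(row[i]) for row in data] for i in range(len(variable))}\n",
-    "    except FileNotFoundError:\n",
-    "        print(\"Error: File not found. Please check file name, extension and path\")"
-   ]
-  },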
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR4 - Develop a function to calculate the Pearson Correlation Coefficient for two named columns"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "##### `Pearson's R formulae`\n",
-    "\n",
-    "Given paired data, consisting of $n$ pairs: \n",
-    "$$\n",
-    "\n",
-    "\\displaystyle\n",
-    "r =\n",
-    "  \\frac{ \\sum_{i=1}^{n}(x_i-\\bar{x})(y_i-\\bar{y}) }{%\n",
-    "        \\sqrt{\\sum_{i=1}^{n}(x_i-\\bar{x})^2}\\sqrt{\\sum_{i=1}^{n}(y_i-\\bar{y})^2}}\n",
-    "$$\n",
-    "\n",
-    "\n",
-    "\n",
-    "Pearson's correlation coefficient measures linear association between two variables\n",
-    "* range of -1 to 1\n",
-    "* -1 = perfect negative linear correlation\n",
-    "* 1 = perfect positive linear correlation\n",
-    "\n",
-    "source: [Wikipedia](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#Definition)\n",
-    "\n",
-    "\n",
-    "##### <b>`Pseudocode:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> Two lists of values\n",
-    "\n",
-    "<b>`Parameters:` </b> Just the lists\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* <strike>variable 1 name, variable 2 name, </strike> r Pearson's correlation coefficient\n",
-    "\n",
-    "\n",
-    "<b>`Steps: `</b>\n",
-    "\n",
-    "1. get data - two lists (x, y)\n",
-    "2. check that input is correct (try / except):\n",
-    "* must be lists\n",
-    "* must only have numbers\n",
-    "* must be the same length\n",
-    "* must be of length > 0\n",
-    "3. calculating pieces for r coeff\n",
-    "* averages of x, y (use FR1)\n",
-    "* iterations:\n",
-    "  * x_diff = $x_i - x_m$\n",
-    "  * y_diff = $y_i - y_m$\n",
-    "* sums of interations :\n",
-    "  * sum_prod_diff += x_diff * y_diff\n",
-    "  * sum_sq_x_diff += x_diff * x_diff\n",
-    "  * sum_sq_y_diff += y_diff * y_diff\n",
-    "\n",
-    "4. calculating r:\n",
-    "* r = sum_prod_diff / sqrt(sum_sq_x_diff * sum_sq_y_diff)\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "\n",
-    "\n",
-    "def pearsonCorrCoef(x, y):\n",
-    "    '''\n",
-    "    Function to calculate Pearson's Correlation Coefficient for two lists of numbers.  Returns Pearson's Correlation Coefficient. \n",
-    "    \n",
-    "    r = sum of products of differences / sqrt[(sum of squares of differences x  * sum of squares of differences y)]\n",
-    "\n",
-    "    r = sum[(x - avg_x) * (y - avg_y)] / [sum((x - avg_x) ** 2) * sum((y - avg_y) ** 2] ** 0.5\n",
-    "    '''\n",
-    "    \n",
-    "    try: \n",
-    "        # Check that x and y are lists of the same length, error if not\n",
-    "        \n",
-    "        assert len(x) == len(y)\n",
-    "        n = len(x) # number of data points in lists\n",
-    "        assert n > 0 # check that there is at least one data point\n",
-    "    except AssertionError:\n",
-    "        print(\"Error: x and y MUST be same-length lists of only numbers in order to calculate Pearson's Correlation Coefficient\")\n",
-    "        return None\n",
-    "\n",
-    "    # arithmetic mean of x, y using FR1 function\n",
-    "    avg_x = mean(x)\n",
-    "    avg_y = mean(y)\n",
-    "    \n",
-    "    # initiate variables which will hold sums of columns needed to calculate Pearson's Correlation Coefficient\n",
-    "    # sum of products of differences (x - avg_x) * (y - avg_y)\n",
-    "    # sum of squares of differences (x - avg_x) ** 2\n",
-    "    # sum of squares of differences (y - avg_y) ** 2\n",
-    "\n",
-    "    sum_prod_diff = 0\n",
-    "    sum_sq_x_diff = 0 \n",
-    "    sum_sq_y_diff = 0\n",
-    "        \n",
-    "    #iterate to caculate sums needed - sums of products of differences, sums of squares of differences x, sums of squares of differences y\n",
-    "\n",
-    "    for i in range(n):\n",
-    "        x_diff = x[i] - avg_x\n",
-    "        y_diff = y[i] - avg_y\n",
-    "        sum_prod_diff += x_diff * y_diff\n",
-    "        sum_sq_x_diff += x_diff ** 2\n",
-    "        sum_sq_y_diff += y_diff ** 2\n",
-    "    \n",
-    "    # calculate Pearson's Correlation Coefficient\n",
-    "    r = sum_prod_diff/ (sum_sq_x_diff * sum_sq_y_diff) ** 0.5\n",
-    "    \n",
-    "    #print options based on number of data points\n",
-    "\n",
-    "    if n<5:\n",
-    "        print(\"Pearson's Correlation Coefficient for\", x, \"and\", y, \"is\", round(r,4), \"with\", n, \"data points\")\n",
-    "        #return r\n",
-    "    else:\n",
-    "        print(\"Pearson's Correlation Coefficient for\", str(x[0:4])+\"... and\", str(y[0:4])+\"... is\", round(r,4), \"with\", n, \"data points\")\n",
-    "        #return r\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 206,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Pearson's Correlation Coefficient for [1, 2, 3, 5] and [1, 5, 7, 8] is 0.8984458631125747 with 4 data points\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "0.8984458631125747"
-      ]
-     },
-     "execution_count": 206,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "x = [1, 2, 3, 5]\n",
-    "y = [1, 5, 7, 8]\n",
-    "pearsonCorrCoef(x, y)"
-   ]
-  },
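-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Cross-check of the hand-rolled r against numpy's `corrcoef`, as mentioned in the development report below (verification only - numpy is not used in the FR implementations themselves):"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Verification only - numpy is not part of the FR solutions\n",
-    "import numpy as np\n",
-    "\n",
-    "# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is r\n",
-    "print(np.corrcoef(x, y)[0, 1])"
-   ]
-  },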
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR5 - Develop a function to generate a set of Pearson Correlation Coefficients for a given data file "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "\n",
-    "\n",
-    "##### <b>`Pseudocode:`</b> \n",
-    "\n",
-    "\n",
-    "<font color = red> Questions: </font> \n",
-    "- is this reading a csv and specifying columns by name as function parameters? \n",
-    "- or somehow return available columns and allowing user to input which ones to use?\n",
-    "- or by column number - maybe combination of both?? \n",
-    "  - tell user columns with col number and ask to input two numbers??\n",
-    "- maybe I could use above function x2 - choose col 1, then choose col 2\n",
-    "- <b> TBD </b>\n",
-    "\n",
-    "<b>`Input:`</b> data read into memory using FR3 (read_csv_into_dictionary); dictionary of colname/data k-v pairs\n",
-    "\n",
-    "<b>`Parameters:` </b> filename\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* variable 1 name, variable 2 name, r (Pearson's correlation coefficient) for each k-v pair in dictionary\n",
-    "\n",
-    "<b>`Steps: `</b>\n",
-    "\n",
-    "1. read data into memory using `read_csv_into_dictionary` => dictionary of colname, data k-v pairs\n",
-    "2. iterate over dictonary data in for-if loop or list comprehension - \n",
-    "   * iterate to make pairs from dictionary?\n",
-    "   * passing into `pearsonCorrCoeff` function\n",
-    "   * need to modify `pearsonCorrCoeff` for print statements and output.\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 186,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "x = [1, 2, 3, 5, 2]\n",
-    "y = [1, 5, 7, 8, 5]\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "\n",
-    "\n",
-    "def pearsonCorrCoef_simple(x,y):\n",
-    "    '''\n",
-    "    Function to calculate Pearson's Correlation Coefficient two lists of numbers.  Returns Pearson's Correlation Coefficient. As above, but without print options.\n",
-    "    '''\n",
-    "        \n",
-    "    n = len(x) # number of data points\n",
-    "    \n",
-    "    # arithmetic mean of x, y\n",
-    "    avg_x = mean(x)\n",
-    "    avg_y = mean(y)\n",
-    "    \n",
-    "    # initiate variables holding sums of rows needed to calculate Pearson's Correlation Coefficient\n",
-    "    \n",
-    "    sum_prod_diff = 0\n",
-    "    sum_sq_x_diff = 0 \n",
-    "    sum_sq_y_diff = 0\n",
-    "        \n",
-    "    #iterate to caculate sums needed, as above\n",
-    "\n",
-    "    for i in range(n):\n",
-    "        x_diff = x[i] - avg_x\n",
-    "        y_diff = y[i] - avg_y\n",
-    "        sum_prod_diff += x_diff * y_diff\n",
-    "        sum_sq_x_diff += x_diff ** 2\n",
-    "        sum_sq_y_diff += y_diff ** 2\n",
-    "    \n",
-    "    #calculate Pearson's Correlation Coefficient\n",
-    "    r = sum_prod_diff/ (sum_sq_x_diff * sum_sq_y_diff) ** 0.5\n",
-    "\n",
-    "    return r\n",
-    "    "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 223,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[-0.3872983346207417, -0.1889822365046136, -0.5367450401216932]"
-      ]
-     },
-     "execution_count": 223,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "dict2 = {'a': [1, 1, 2, 1,3, 4], 'd': [2, 5, 6,4, 3, 2], 'e': [13, 9, 7,6,7,9]}\n",
-    "\n",
-    "data = [i for i in dict2.values()]\n",
-    "variable = [i for i in dict2.keys()]\n",
-    "#[pearsonCorrCoef_simple(data[i], data[j]) for i in range(len(data)) for j in range(i+1,len(data))]\n",
-    "[pearsonCorrCoef_simple(data[i], data[j]) for i in range(len(data)) for j in range(i+1,len(data))]\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 232,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "a a 1.0\n",
-      "a d -0.3872983346207417\n",
-      "a e -0.1889822365046136\n",
-      "d a -0.3872983346207417\n",
-      "d d 1.0\n",
-      "d e -0.5367450401216932\n",
-      "e a -0.1889822365046136\n",
-      "e d -0.5367450401216932\n",
-      "e e 1.0\n"
-     ]
-    }
-   ],
-   "source": [
-    "for variable in dict2:\n",
-    "    data = dict2[variable]\n",
-    "    #print(variable, data)\n",
-    "    for variable2 in dict2:\n",
-    "        data2 = dict2[variable2]\n",
-    "        #print(variable2, data2)\n",
-    "        #if variable != variable2:\n",
-    "        pearsonCorrCoef_simple(data, data2)\n",
-    "        print(variable, variable2, pearsonCorrCoef_simple(data, data2))\n",
-    "        \n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 231,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "a d -0.3872983346207417\n",
-      "a e -0.1889822365046136\n",
-      "d a -0.3872983346207417\n",
-      "d e -0.5367450401216932\n",
-      "e a -0.1889822365046136\n",
-      "e d -0.5367450401216932\n"
-     ]
-    }
-   ],
-   "source": [
-    "for variable in dict2:\n",
-    "    data = dict2[variable]\n",
-    "    #print(variable, data)\n",
-    "    for variable2 in dict2:\n",
-    "        data2 = dict2[variable2]\n",
-    "        #print(variable2, data2)\n",
-    "        if variable != variable2:\n",
-    "            pearsonCorrCoef_simple(data, data2)\n",
-    "            print(variable, variable2, pearsonCorrCoef_simple(data, data2))"
-   ]
-  },
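-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "An alternative pairing approach (a sketch only; the coursework solution above sticks to plain loops and comprehensions): `itertools.combinations` yields each unordered pair of columns exactly once."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Alternative sketch using the standard library - not the submitted approach\n",
-    "from itertools import combinations\n",
-    "\n",
-    "# each unordered pair of (name, data) items appears exactly once\n",
-    "for (name1, data1), (name2, data2) in combinations(dict2.items(), 2):\n",
-    "    print(name1, name2, pearsonCorrCoef_simple(data1, data2))"
-   ]
-  },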
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "[('age', 'age', 1.0), ('age', 'pop', -0.026709155288578715), ('age', 'share_white', 0.19960545971348648), ('age', 'share_black', -0.08806861565745706), ('age', 'share_hispanic', -0.13679338429773677), ('age', 'personal_income', 0.03247940547175454), ('age', 'household_income', 0.07122884715312655), ('age', 'poverty_rate', -0.11501578217612785), ('age', 'unemployment_rate', -0.08924108157207414), ('age', 'uni_education_25+\\n', -0.015551355328918883), ('pop', 'age', -0.026709155288578715), ('pop', 'pop', 1.0000000000000002), ('pop', 'share_white', 0.0755124299211167), ('pop', 'share_black', -0.15619530865683967), ('pop', 'share_hispanic', 0.06195481228740331), ('pop', 'personal_income', 0.20485917876690452), ('pop', 'household_income', 0.3051738076744344), ('pop', 'poverty_rate', -0.29132966650032943), ('pop', 'unemployment_rate', -0.21783723363180305), ('pop', 'uni_education_25+\\n', 0.11698059897943787), ('share_white', 'age', 0.19960545971348648), ('share_white', 'pop', 0.0755124299211167), ('share_white', 'share_white', 1.0000000000000002), ('share_white', 'share_black', -0.5449723033049567), ('share_white', 'share_hispanic', -0.5774407139938217), ('share_white', 'personal_income', 0.35839184199994806), ('share_white', 'household_income', 0.32212297679344865), ('share_white', 'poverty_rate', -0.49770809057624715), ('share_white', 'unemployment_rate', -0.3896695027097979), ('share_white', 'uni_education_25+\\n', 0.33416476681871427), ('share_black', 'age', -0.08806861565745706), ('share_black', 'pop', -0.15619530865683967), ('share_black', 'share_white', -0.5449723033049567), ('share_black', 'share_black', 1.0), ('share_black', 'share_hispanic', -0.2624176266788398), ('share_black', 'personal_income', -0.2824788256535901), ('share_black', 'household_income', -0.34673961691580957), ('share_black', 'poverty_rate', 0.4306656974717496), ('share_black', 'unemployment_rate', 0.4836283024208505), ('share_black', 'uni_education_25+\\n', -0.2129601024183306), ('share_hispanic', 'age', -0.13679338429773677), ('share_hispanic', 'pop', 0.06195481228740331), ('share_hispanic', 'share_white', -0.5774407139938217), ('share_hispanic', 'share_black', -0.2624176266788398), ('share_hispanic', 'share_hispanic', 1.0000000000000002), ('share_hispanic', 'personal_income', -0.2231256947246905), ('share_hispanic', 'household_income', -0.13596088920701366), ('share_hispanic', 'poverty_rate', 0.20829495353292043), ('share_hispanic', 'unemployment_rate', 0.014748972805766968), ('share_hispanic', 'uni_education_25+\\n', -0.2909783037426069), ('personal_income', 'age', 0.03247940547175454), ('personal_income', 'pop', 0.20485917876690452), ('personal_income', 'share_white', 0.35839184199994806), ('personal_income', 'share_black', -0.2824788256535901), ('personal_income', 'share_hispanic', -0.2231256947246905), ('personal_income', 'personal_income', 1.0), ('personal_income', 'household_income', 0.8319631230491987), ('personal_income', 'poverty_rate', -0.6959234082905974), ('personal_income', 'unemployment_rate', -0.5049325127827728), ('personal_income', 'uni_education_25+\\n', 0.7166080399373852), ('household_income', 'age', 0.07122884715312655), ('household_income', 'pop', 0.3051738076744344), ('household_income', 'share_white', 0.32212297679344865), ('household_income', 'share_black', -0.34673961691580957), ('household_income', 'share_hispanic', -0.13596088920701366), ('household_income', 'personal_income', 0.8319631230491987), ('household_income', 'household_income', 1.0), ('household_income', 
'poverty_rate', -0.7541757449430393), ('household_income', 'unemployment_rate', -0.5099954109970329), ('household_income', 'uni_education_25+\\n', 0.6729008330623418), ('poverty_rate', 'age', -0.11501578217612785), ('poverty_rate', 'pop', -0.29132966650032943), ('poverty_rate', 'share_white', -0.49770809057624715), ('poverty_rate', 'share_black', 0.4306656974717496), ('poverty_rate', 'share_hispanic', 0.20829495353292043), ('poverty_rate', 'personal_income', -0.6959234082905974), ('poverty_rate', 'household_income', -0.7541757449430393), ('poverty_rate', 'poverty_rate', 1.0000000000000002), ('poverty_rate', 'unemployment_rate', 0.5916868406520023), ('poverty_rate', 'uni_education_25+\\n', -0.46033635796187616), ('unemployment_rate', 'age', -0.08924108157207414), ('unemployment_rate', 'pop', -0.21783723363180305), ('unemployment_rate', 'share_white', -0.3896695027097979), ('unemployment_rate', 'share_black', 0.4836283024208505), ('unemployment_rate', 'share_hispanic', 0.014748972805766968), ('unemployment_rate', 'personal_income', -0.5049325127827728), ('unemployment_rate', 'household_income', -0.5099954109970329), ('unemployment_rate', 'poverty_rate', 0.5916868406520023), ('unemployment_rate', 'unemployment_rate', 1.0), ('unemployment_rate', 'uni_education_25+\\n', -0.4663884373739078), ('uni_education_25+\\n', 'age', -0.015551355328918883), ('uni_education_25+\\n', 'pop', 0.11698059897943787), ('uni_education_25+\\n', 'share_white', 0.33416476681871427), ('uni_education_25+\\n', 'share_black', -0.2129601024183306), ('uni_education_25+\\n', 'share_hispanic', -0.2909783037426069), ('uni_education_25+\\n', 'personal_income', 0.7166080399373852), ('uni_education_25+\\n', 'household_income', 0.6729008330623418), ('uni_education_25+\\n', 'poverty_rate', -0.46033635796187616), ('uni_education_25+\\n', 'unemployment_rate', -0.4663884373739078), ('uni_education_25+\\n', 'uni_education_25+\\n', 1.0000000000000002)]\n"
-     ]
-    }
-   ],
-   "source": [
-    "\n",
-    "results = [(variable, variable2, pearsonCorrCoef(my_dict[variable], my_dict[variable2]))\n",
-    "           for variable in my_dict\n",
-    "           for variable2 in my_dict]\n",
-    "print(results)\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "my_dict = read_csv_into_dictionary('task1.csv')\n",
-    "\n",
-    "\n",
-    "\n",
-    "for variable in my_dict:\n",
-    "    data = my_dict[variable]\n",
-    "    \n",
-    "\n",
-    "    for variable2 in my_dict:\n",
-    "        data2 = my_dict[variable2]            \n",
-    "        r = PCC_simple(data, data2)\n",
-    "\n",
-    "        print([(variable, variable2, r)])\n",
-    "        \n",
-    "       \n"
-   ]
-  },
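-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "A tidier variant (a sketch, building on the rounding idea noted in the report below): round r to 4 d.p. and keep each unordered pair of columns once."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Sketch: unique unordered column pairs with rounded r values\n",
-    "my_dict = read_csv_into_dictionary('task1.csv')\n",
-    "variables = list(my_dict.keys())\n",
-    "results = [(variables[i], variables[j],\n",
-    "            round(pearsonCorrCoef_simple(my_dict[variables[i]], my_dict[variables[j]]), 4))\n",
-    "           for i in range(len(variables))\n",
-    "           for j in range(i + 1, len(variables))]\n",
-    "print(results)"
-   ]
-  },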
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "# replace with your code"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### <b> delete </b>\n",
-    "\n",
-    "* output (see appendix 4)\n",
-    "  * input: \n",
-    "  * print list of columns (variables) available in csv/data??\n",
-    "  * user input by name?? or colnumber??  confirmation??\n",
-    "\n",
-    "* output: \n",
-    "  * return results as input, with nicely bordered tables - old-school dot-matrix print outs (like r)\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR6 - Develop a function to print a custom table"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "##### <b>`Pseudocode:`</b> \n",
-    "\n",
-    "\n",
-    "<font color = red> Questions: </font> \n",
-    "\n",
-    "* are there any limitations to character usage - probably not\n",
-    "* what is the source of the list of tuples?  - probably doesn't matter - i.e. output from FR5 or variable storing list of tuples or list of tuples\n",
-    "* columns to include?? - not sure what this means yet\n",
-    "\n",
-    "* doesn't the list of tuples parameter determine which columns are included?  \n",
-    "  * may need to clarify this!\n",
-    "\n",
-    "- <b> TBD </b>\n",
-    "\n",
-    "<b>`Input:`</b> \n",
-    "\n",
-    "* list of tuples, e.g.\n",
-    "  * var1, var1, pcc11\n",
-    "  * var1, var2, pcc12\n",
-    "  * var1, var3, pcc13\n",
-    "  * var1, var4, pcc14\n",
-    "  * var2, var3, pcc23\n",
-    "  * var2, var4, pcc\n",
-    "  * var3, var4, pcc\n",
-    "  * var4, var4, pcc\n",
-    "\n",
-    "|    |     |     |     |\n",
-    "|----|-----|-----|-----|\n",
-    "|    |var1 |var2 |var3 |\n",
-    "|var1|pcc11|pcc12|pcc13|\n",
-    "|var2|pcc21|pcc22|pcc23|\n",
-    "|var3|pcc31|pcc32|pcc33|\n",
-    "\n",
-    "tuple = [(var1, var1, pcc11), (var1, var2, pcc12)]\n",
-    "variable list = [var1, var2, var3]\n",
-    "\n",
-    "row header = iterate over variable list\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "<b>`Parameters:` </b> \n",
-    "* source of list of correlation coefficient tuples\n",
-    "* border character to use\n",
-    "* columns to include\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* nicely printed ascii / dot matrix style table\n",
-    "* formatted\n",
-    "* padded / spaced\n",
-    "* legible\n",
-    "* footer??\n",
-    "* title??\n",
-    "\n",
-    "<b>`Steps: `</b>\n",
-    "\n",
-    "1. No idea, yet...read about string formatting, probably.\n",
-    "2. list of tuples\n",
-    "   * get variables into list (for header row and first element of each data row)\n",
-    "3. enumerate over variable list for header row\n",
-    "4. maybe get data into the right format to iterate over rows\n",
-    "\n",
-    "first unique element of each tuple = column headers\n",
-    "iterate over elements 2, 3 for each tuple to print table\n",
-    "\n",
-    "\n",
-    "\n",
-    "READ : https://docs.python.org/3/library/string.html#formatspec"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 281,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "#test tuple based on appendix example\n",
-    "\n",
-    "tup_list = [('Glucose', 'Glucose', 1), ('BP', 'Glucose', 0.1429), ('BMI', 'Glucose', 0.0584), ('Age', 'Glucose', 0.5328), ('Glucose','BP', 0.1429), ('BP', 'BP', 1), ('BMI', 'BP', -0.4522), ('Age', 'BP', 0.4194), ('Glucose', 'BMI', 0.0584), ('BP', 'BMI', -0.4522), ('BMI', 'BMI', 1), ('Age', 'BMI', -0.3847), ('Glucose', 'Age', 0.5328), ('BP', 'Age', 0.4194), ('BMI', 'Age', -0.3847), ('Age', 'Age', 1)]"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 337,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "#pull out first element to use for column header - i.e. first row of table\n",
-    "\n",
-    "def unique_first_element(tup_list):\n",
-    "    '''\n",
-    "    Function to return a list of unique first elements in a list of tuples\n",
-    "    '''\n",
-    "    unique_list = []\n",
-    "    for elem in tup_list:\n",
-    "        if elem[0] not in unique_list:\n",
-    "            unique_list.append(elem[0])\n",
-    "    return unique_list\n",
-    "\n",
-    "unique_first_element(tup_list)\n",
-    "    "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 343,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "                 Glucose          BP         BMI         Age\n"
-     ]
-    }
-   ],
-   "source": [
-    "\n",
-    "\n",
-    "header_row = \"{:>12}\" * (len(unique_first_element(tup_list))+1)\n",
-    "print(header_row.format(\" \", *unique_first_element(tup_list)))\n",
-    "\n",
-    "\n",
-    "            "
-   ]
-  },
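-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The fixed width of 12 above overflows for longer variable names. The submitted notebook uses a `max_col_width` helper for this; its body is not shown in this draft, so the version below is a guessed reconstruction rather than the submitted code:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Assumed reconstruction of the max_col_width helper referenced in FR6 -\n",
-    "# the real body is not shown in this draft\n",
-    "def max_col_width(tup_list):\n",
-    "    '''\n",
-    "    Function to calculate the maximum column width for a list of tuples\n",
-    "    '''\n",
-    "    # widest cell contents across every element of every tuple, plus padding\n",
-    "    return max(len(str(elem)) for tup in tup_list for elem in tup) + 2\n",
-    "\n",
-    "width = max_col_width(tup_list)\n",
-    "header_row = (\"{:>\" + str(width) + \"}\") * (len(unique_first_element(tup_list)) + 1)\n",
-    "print(header_row.format(\" \", *unique_first_element(tup_list)))"
-   ]
-  },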
-  {
-   "cell_type": "code",
-   "execution_count": 263,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "['A', 'C', 'E', 'G', 'J']"
-      ]
-     },
-     "execution_count": 263,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "def create_table(listTuples, row_labels, col_labels):\n",
-    "    for row_index, label in enumerate(row_labels):\n",
-    "        print(\"{0: <20}\".format(label), end = '')\n",
-    "        for column in listTuples:\n",
-    "            print(\"{0: <20}\".format(column[row_index]), end = '')\n",
-    "        print()\n",
-    "\n",
-    "\n",
-    "\n",
-    "variables =  [var[0] for var in tup_list]\n",
-    "variables\n",
-    "#create_table(tup_list, variables, variables)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# Coding Standards\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# Process Development Report for Task 1\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Draft version below - 966 words (including everything)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "•\tYou are expected to identify the strengths/weaknesses of your approach. For this programming task, you are expected to write a reflective report which focuses on the process taken to develop a solution to the task. Please reflect on your experiences rather than simply describing what you did. The report should: \n",
-    "o\tinclude an explanation of how you approached the task.\n",
-    "o\tidentify any strengths/weaknesses of the approach used.\n",
-    "o\tconsider how the approach used could be improved.\n",
-    "o\tsuggest alternative approaches that could have been taken instead of the one you used.\n",
-    "\n",
-    "o\tup to 8 marks for the Development Process Report\n",
-    "\tMarks will be awarded for appropriate use of technical language, critical reflection on development process and quality of engagement with the reflective process\n",
-    "\n",
-    "word count - 500 max\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "FR1: arithmetic mean\n",
-    "\n",
-    "* easy to implement - good confidence boost\n",
-    "* allows for getting into a flow of coding, including comments, docstrings, naming conventions (variables and functions), testing, etc.\n",
-    "* I kept returning to the naming and commenting of early requirements - as they developed.  \n",
-    "* Challenge was getting the balance right - on the one hand, I wanted to ensure that I was including adequate comments but on the other hand, I did not want to over document.  Bearing in mind that this is an assessed element of coursework submitted for a master's module, I think I have the right balance.  In 'the real world,' I would probably shorten some of the docstrings and remove some in line comments.  \n",
-    "* My tactic has been to try to name variables and functions in an unambiguous, meaningful manner so that documentation is unnecessary.\n",
-    "* I haven't settled on a style yet - e.g. casing (proper, camel, etc.) and use of underscores.  \n",
-    "* When writing functions, I wanted them to have reuse value - that is, to keep them general.  With some requirements, this was not possible - some of my ideas would have deviated from the brief.  Examples later.  \n",
-    "* I wanted to use helpful python code where relevant, e.g. try/except blocks with specific (anticipated) errors captured.  \n",
-    "\n",
-    "* WHat I would do differently:\n",
-    "  * I need to be mindful of using restricted/reserved words as variable names.  I have used 'list' in FR1 function.  This is not a problem in this instance but it could be in others.  I will need to be more careful in future.\n",
-    "  * I also would like to assess the efficiency of my functions - there are so many ways to write functions producing the same outcomes in python.  Absolutely amazed at how flexible and powerful it is - but at the moment, I am pleased when I can get my code to work.  Efficiency, elegance, etc. will come with time and experience - but I could start using time to assess processing times for different approaches. \n",
-    "\n",
-    "FR2: read single column from CSV file\n",
-    "\n",
-    "* Again, I wanted to make a reusable function, so I added 'delimiter' as an optional parameter.  I prefer to use pipes (|) as delimiters as they are very seldom used in text.  This is something I have found handy in data migration work I have undertaken in my current job.  Quite often - common characters like commas and 'invisible' spacing characters like tabs, new line, carriage returns, etc. can wreak havoc when migrating data.  \n",
-    "* I liked this function as it was a good opportunity to use the 'with' statement.  I like that it 'closes' the file when finished, without a need for an explicit close statement.  \n",
-    "* The challenge for me in this requirement was working out how to extract the header row and then proceed to the data rows.  \n",
-    "* It was a good example of where I knew what I wanted to do and it seemed simple enough, but took some trial and error before I got there (plenty more to come!)\n",
-    "* Also added a specific FileNotFoundError exception as I expect this to be the most common error.\n",
-    "* I also added a if/else statement to account for whether the file has a header - returning results accordingly.  While this is not in the brief, I was thinking about a future proofed function which can handle both varieties.\n",
-    "\n",
-    "FR3: read CSV data from a file into memory\n",
-    "\n",
-    "* Much of the above applies - the function docstring is probably a bit long but I wanted to be clear and complete in my documentation.\n",
-    "* This function builds on FR2 - but returns a dictionary by iterating through the data rows of the file.  \n",
-    "* I worked through many versions of this - for loops, enumerate, etc. - but settled for a list comprehension, once I get it working the way it should.  I am not sure if it is the most efficient way to do it, but it works.  However, I like the conciseness of list comprehensions - although they may not be as readable as a for loop.\n",
-    "* In terms of variable names, I am not always clear on the standards - i, j, row, line, etc. and I vary between them - which annoys me.  Consistency, clarity, transparency, etc. are very important to me. I am not so happy with 'variable_data_dict' as a name, but couldn't find anything better - 'my_dict', 'data_dict', etc. - I am critical of them all. \n",
-    "\n",
-    "\n",
-    "* one problem I had with FR3 was discovered when using the function in various of FR4.  The original dictionary produced string key-value pairs - which when printing looked fine to me.  It took me some time to realise that the values were string - that they were enclosed in single quotation marks.  This was certainly a source of frustration - when something simply was not behaving the way I was anticipating.\n",
-    "  * However, once I noticed, I tried to amend the list comprehension by convernting to integers with int() - which caused an error - \"invalid literal for int() with base 10: '60.5'\"\n",
-    "  * Therefore I converted the data to floats - which may not be quite right for ages, but will not impact the calculations.\n",
-    "* \n",
-    "\n",
-    "FR4: Pearson Correlation Coefficient\n",
-    "\n",
-    "* I spent a lot of time on this function and tried to make it work using several methodologies.  \n",
-    "* Pseudo code was particularly helpful in this instance, but I have a tendency of starting with pseudo code and then deviating into a confused space of trial and error rather than updating the pseudo code.  Definitely something I can work on and take away into my current day job. \n",
-    "* Firstly I needed to remind myself how to calculate Pearson's r having only have had it calculated for me in SPSS or R.  There a couple of approaches - but the strategy is to break down the formula into standalone calculations and build the function to produce these before putting it all together to return R.  \n",
-    "* I spent a lot of time trying different approaches.  Again, like last time I started with for loops and enumerate() but found that I couldn't always achieve my intentions - that I would get slightly lost in the hierarchy and multilined approach, so I used list comprehensions instead and in the end, I am pleased with the result.\n",
-    "\n",
-    "* As previously, I tried thinking about generalising and future-proofing the function for future use and that includes considering what tests / checks / validation I could put in place.  The most obvious source of error will be the csv file so I added some assertions to ensure that the csv file has the correct data before proceeding to calculate Pearson's r.  \n",
-    "* I tested my version of Pearsons with numpy corrcoef() and it produces the same results for a small data set, which gives you that excitement of having achieved your intentions.\n",
-    "\n",
-    "\n",
-    "FR5: Pearson Correlation Coefficients for a file\n",
-    "\n",
-    "* This function builds on FR4 by calculating Pearson's r for each pair of columns in a csv file.\n",
-    "* I enjoyed using FR3 and FR4 (which in turn use FR1) within this function.  That is pretty cool.\n",
-    "* It does not have many lines of code but it took me a while to get it working.  I had to think about how to iterate through the columns of the csv in terms of what has been returned by FR3 and FR4 and then getting the column names into a list of tuples.\n",
-    "* It was not plain sailing and there was a fair bit of frustration - another good example of where I knew exactly what I wanted to do but it was hard to get there.  I had a few versions on the go - I think I could have benefited by sticking with one and updating the pseudo code.  \n",
-    "* As previously, I moved from for loops to list comprehensions as I had better success getting the results I wanted, especially returning lists of tuples.  Initially, I was returning individual lists instead of lists of tuples.  \n",
-    "* The joy of success after some struggle can be very satisfying. \n",
-    "\n",
-    "I would probably look to round some of the output in future iterations.\n",
-    "Actually - I just added the rounding now!\n",
-    "\n",
-    "And finally\n",
-    "\n",
-    "FR6: printing the results\n",
-    "\n",
-    "* I found this FR very challenging and a bit frustrating.\n",
-    "* I could have benefitted significantly by sticking with my pseudo code and keeping it up to date - but once I got stuck into a cycle of trial and error, minor success and subsequent set-back, I did not rever to pseudocode and focused on cracking the issue which was hampering me.  \n",
-    "* This function was created through trial and error - entirely\n",
-    "* I broke it down into separate sections and tried to think how I would iterate through the list of tuples to get a decent end result.\n",
-    "* Some sections worked, others didn't, changes then impacted the rest of the function and I was back to square one - hardforking!\n",
-    "* Towards the end, I had functioning portions - the top table header, the column headers, the rows headers, the r-values.  This was a mess of for loops, list comprehensions, and string formatting which I needed to work through step by step.\n",
-    "* I found cells in Jupyter notebooks to be invaluable - the ability to run portions of code and easily move it around is very effective and powerul.  I also found the ability to comment out sections of code very useful - I could easily comment out sections of code and then uncomment them to see how they impacted the rest of the function.\n",
-    "* Finally, the pieces (rows of code) were lining up - at this point, I resorted to tweaking and assessing the impact of each change.  I discovered that I needed data specific to the input table - rather than fixed widths - so created a separate function to calculate max column widths and then played with that to get the table to look passable.  \n",
-    "\n",
-    "Whew - got there in the end!  \n",
-    "\n",
-    "Although this was more challenging to get motivated over as I am sure I would use some library to print tables without the effort and challenge - i.e. pandas, prettytable, etc.  However, in hindsight I am glad I persevered as I have learned a lot about string formatting and how to iterate through lists of tuples.  It allowed me to trouble-shoot, iterate through a requirement, reassess and rework through many cycles.  I am sure I will use this knowledge in the future.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "add markdown text here\n",
-    "\n",
-    "coding - use of x, y, common practices, etc.\n",
-    "naming conventions (camel, underscore, etc.)\n",
-    "level/amount of commenting\n",
-    "efficiency of code\n",
-    "\n",
-    "\n",
-    "Thought I was attentive to detail\n",
-    "Eg print statements \n",
-    "\n",
-    "Challenging to get the iteration done\n",
-    "Know what I want to do but achieving it programmatically is hard and slow\n",
-    "\n",
-    "Balancing the need to know  vs how to find out what I need to know\n",
-    "For example researching code, assessing, evaluating, understanding, not assuming. \n",
-    "\n",
-    "Knowing fundamentals but not everything. Realistically would use libraries where they do what you need them to. Maybe not most efficient, but most practical or ease of use or pretty. Maybe not quickest run time but quicker to get up and running. There's a balance to be struck - knowing what your requirements are and adapting to them...metaphors I use - chefs and recipes and ingredients and cake shops.  Sometimes you need to buy the cake pre-made and decorated, sometimes you make it from scratch, sometimes you buy it and decorate it, sometimes you farm the eggs.  \n",
-    "\n",
-    "Or building a shed: Building shed analogy \n",
-    "\n",
-    "Don't need to know how to build the pieces individually\n",
-    "Carpentry power tools skills\n",
-    "May get a better product\n",
-    "May get exactly what I want\n",
-    "But requires a sig investment and aptitude whereas can by prefab, deconstructed where hard work done. You can focus on the outcome and the nuance and styling\n",
-    "\n",
-    "Or decorating a room - v being interior designer\n",
-    "\n",
-    "Coding and documenting. Importance of. Dry wet. Balance. "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### <b> Introduction </b>\n",
-    "\n",
-    "The purpose of this report is to provide a short, critical self-assessment of all aspects of the code development process undertaken for Task 1 of the assessed coursework for `UFCFVQ-15-M Programming for Data Science`.  The report will include a brief description of the code, a reflection on the development process and an evaluation of the code, including strengths and weaknesses and alternatives to consider.  \n",
-    "\n",
-    "\n",
-    "### <b> Code Description </b>\n",
-    "\n",
-    "The requirement for `Task_1` is to calculate Pearson Correlation Coefficients (PCCs) for all pairs of variables in a given data file, without using any external libraries such as Pandas, and print a decent-looking, customisable table of the results.  \n",
-    "\n",
-    "The code was written in an interactive Jupyter notebook using a Python 3.11 kernel.  \n",
-    "\n",
-    "The coursework is structured so that the code must be developed in a series of steps, each building on or incorporating its predecessor.  There are six functional requirements (FRs) as follows:\n",
-    "\n",
-    "| FR  | Description           |\n",
-    "|-----|-----------------------|\n",
-    "| FR1 | Arithmetic mean       |   \n",
-    "| FR2 | Read column from file |   \n",
-    "| FR3 | Read file             |   \n",
-    "| FR4 | PCC for two lists     |   \n",
-    "| FR5 | PCC for file          |   \n",
-    "| FR6 | Print table           |   \n",
-    "\n",
-    "\n",
-    "Pearson's correlation coefficient measures linear association between two variables, ranging from -1 to 1, where -1 is perfect negative linear correlation and 1 is perfect positive linear correlation.  The formula for calculating the PCC is as follows:\n",
-    "\n",
-    "Given paired data, consisting of $n$ pairs: \n",
-    "$$\n",
-    "\n",
-    "r =\n",
-    "  \\frac{ \\sum_{i=1}^{n}(x_i-\\bar{x})(y_i-\\bar{y}) }{%\n",
-    "        \\sqrt{\\sum_{i=1}^{n}(x_i-\\bar{x})^2}\\sqrt{\\sum_{i=1}^{n}(y_i-\\bar{y})^2}}\n",
-    "$$\n",
-    "\n",
-    "\n",
-    "source: [Wikipedia](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#Definition)\n",
-    "\n",
-    "\n",
-    " \n",
-    "### <b> Reflection on Development Process</b>\n",
-    "\n",
-    "My development process made use of the inherent stratified nature of the task, allowing me to plan, develop and test each FR independently in separate code cells, before combining where appropriate.  I found ths particularly useful for the more complex FRs - 4, 5, and 6 which required significant iteration, investigation and testing before achieving the desired results.  This also worked very well in terms of source and version control, complementing the use of Git.\n",
-    "\n",
-    "My strategy was to use a localised crisp-dm approach which included initial articulation of requirements followed by iterative pseudocode, Python code and testing cycles until the desired results were achieved.  I find this approach very effective and it is one I use in my current employment.  However, having said that, I find that I can occasionally spiral out of control in the testing phase, which can be time-consuming, frustrating and ultimately less productive.  I will continue to try to be mindful of this. \n",
-    "\n",
-    "\n",
-    "[create and insert a crisp-dm diagram here]\n",
-    "\n",
-    "I tried to be disciplined in my approach and for the most part feel like I was successful.  I made deliberate attempts to make use of newly acquired tools and techniques like Git, VS Code, Jupyter notebooks, Markdown - ensuring that I commented on my code, made use of Markdown cells during the development phase, and committing changes to Git in appropriate intervals.\n",
-    "\n",
-    "\n",
-    "### <b> Code Evaluation </b>\n",
-    "\n",
-    "When it comes to code evaluation, I think it is important to acknowledge that the Task_1 code has been developed to meet the coursework specification.  As such, it has been produced with certain limitations and constraints and does not necessarily reflect my approach in a different context, such as my employment or personal projects.  However, it is important to reflect on what has been produced.  \n",
-    "\n",
-    "Overall, I am pleased with the code that I have produced.  It achieves the requirements (as I have interpreted them) and it <i>feels</i> fairly efficient and robust.  \n",
-    "\n",
-    "I tried to be mindful of certain principles when undertaking this work:\n",
-    "\n",
-    "* Future-proofing the functions, i.e. making them flexible and adaptable to different data types and formats, perhaps accepting different parameters, etc. \n",
-    "* Thinking about likely pitfalls from the users perspective and pre-empting likely errors with assertions and exception handling\n",
-    "* Unambiguous, self-explanatory naming of functions and variables, where possible\n",
-    "* Balanced comments and docstrings - considering approaches like DRY (Dont Repeat Yourself), WET (Write Everything Twice), KISS (Keep it Simple, Stupid) bearing in mind possible readers of the code\n",
-    "\n",
-    "I think I have been reasonably successful in achieving these goals.  I have tried to make the code as flexible as possible, accepting different parameters and data types, and I have tried to make the code as robust as possible, using assertions and exception handling to pre-empt likely errors.  I have saved alternate versions of the functions for future use and reference, which take additional parameters or employ a different approach. \n",
-    "\n",
-    "In terms of naming convention, documentation and comments, I think I struck the right balance.  If anything, I have probably over-commented slightly but this is because I am erring on the side of caution.  I have yet to settle on a naming convention and standard and find that I flipflop between different methods like camelCase, snake_case, PascalCase, etc.  I am also aware that there seem to be conventions for variables and functions, especially iterables, which I am not fully using. \n",
-    "\n",
-    "As for the code itself, I think it is fairly efficient and robust.  Of course, there is always room for improvement and I would very much like to know of any suggestions or improvements, or better approaches.\n",
-    "\n",
-    "\n",
-    "#### <b> Summary </b>\n",
-    "\n",
-    "Whilst I have a background in elements of data and systems, I am only getting started in the programming part of this journey.  I am super keen to learn for my personal and professional development and want to pick up best practices, adopt standard approaches and avoid pitfalls and repeating mistakes.  When it comes to Python, I am amazed at how many different ways there are to solve a particular scenario.  This can make it impossible to identify a 'single best approach' - something I will need to get used to and embrace.  \n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3.11.0 64-bit",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.0"
-  },
-  "vscode": {
-   "interpreter": {
-    "hash": "3a85823825384e2f260493b9b35c69d8eaac198ff59bb0d6c0e72fffbde301e2"
-   }
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
diff --git a/UFCFVQ-15-M_Programming_Task_1_submit.ipynb b/UFCFVQ-15-M_Programming_Task_1_submit.ipynb
index faf5866758c549f34f994c0e63c3b7e1934bedcf..084d3834962d26a427d1a9cc6b4dc87a41bf353c 100644
--- a/UFCFVQ-15-M_Programming_Task_1_submit.ipynb
+++ b/UFCFVQ-15-M_Programming_Task_1_submit.ipynb
@@ -380,36 +380,13 @@
     "### Requirement FR6 - Develop a function to print a custom table"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "DELETE this cell and the next cell before submitting\n",
-    "\n",
-    "TODO\n",
-    "\n",
-    "Test with FR5 output\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "#DELETE THIS CELL BEFORE SUBMISSION\n",
-    "# \n",
-    "# test tuple based on appendix example\n",
-    "\n",
-    "tup_list = [('Glucose', 'Glucose', 1), ('BP', 'Glucose', 0.1429), ('BMI', 'Glucose', 0.0584), ('Age', 'Glucose', 0.5328), ('Glucose','BP', 0.1429), ('BP', 'BP', 1), ('BMI', 'BP', -0.4522), ('Age', 'BP', 0.4194), ('Glucose', 'BMI', 0.0584), ('BP', 'BMI', -0.4522), ('BMI', 'BMI', 1), ('Age', 'BMI', -0.3847), ('Glucose', 'Age', 0.5328), ('BP', 'Age', 0.4194), ('BMI', 'Age', -0.3847), ('Age', 'Age', 1)]"
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": 20,
    "metadata": {},
    "outputs": [],
    "source": [
+    "# Function to calculate max column width, used in FR6\n",
     "def max_col_width(tup_list):\n",
     "    '''\n",
     "    Function to calculate the maximum column width for a list of tuples'''\n",
@@ -538,49 +515,18 @@
     "# Process Development Report for Task 1\n"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Plan / Outline: \n",
-    "\n",
-    "`Introduction:` overview of the purpose and objectives of the code, as well as the context in which it was developed (e.g., as part of assessed coursework).\n",
-    "\n",
-    "`Description of the code:` detailed description of the code, including its main features, functionality, and any relevant technical details (e.g., libraries used).\n",
-    "\n",
-    "`Reflection on the development process:` reflect on experience of developing the code, including any challenges and overcoming; discuss any lessons learned or insights gained during the development process.\n",
-    "\n",
-    "`Evaluation of the code:` evaluate the code based on its functionality, performance, and overall quality. discuss limitations or areas for improvement.\n",
-    "\n",
-    "`Conclusion:` summarise the key points of reflection, provide any final thoughts or recommendations for future work.\n",
-    "\n",
-    "\n",
-    "Version 1 below - 966 words (total with everything)\n",
-    "\n",
-    "TODO\n",
-    "\n",
-    "Review version 1\n",
-    "Review notes (in non-submit version)\n",
-    "Reduce in word count\n",
-    "\n"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### <b> Introduction </b>\n",
     "\n",
-    "The purpose of this report is to provide a short, critical self-assessment of all aspects of the code development process undertaken for Task 1 of the assessed coursework for `UFCFVQ-15-M Programming for Data Science`.  The report will include a brief description of the code, a reflection on the development process and an evaluation of the code, including strengths and weaknesses and alternatives to consider.  \n",
-    "\n",
+    "The purpose of this report is to provide a short, critical self-assessment of my code development process for Task 1 of the coursework for `UFCFVQ-15-M Programming_for_Data_Science`.  \n",
     "\n",
     "### <b> Code Description </b>\n",
+    "Task_1 requires writing functions in order, to ultimately calculate Pearson’s Correlation Coefficients (PCCs) for pairs of variables in a given data file, without using imported Python libraries, and printing a decent-looking table.  \n",
     "\n",
-    "The requirement for `Task_1` is to calculate Pearson Correlation Coefficients (PCCs) for all pairs of variables in a given data file, without using any external libraries such as Pandas, and print a decent-looking, customisable table of the results.  \n",
-    "\n",
-    "The code was written in an interactive Jupyter notebook using a Python 3.11 kernel.  \n",
-    "\n",
-    "The coursework is structured so that the code must be developed in a series of steps, each building on or incorporating its predecessor.  There are six functional requirements (FRs) as follows:\n",
+    "Functional requirements (FRs):\n",
     "\n",
     "| FR  | Description           |\n",
     "|-----|-----------------------|\n",
@@ -591,57 +537,51 @@
     "| FR5 | PCC for file          |   \n",
     "| FR6 | Print table           |   \n",
     "\n",
+    "The code was developed in a Jupyter notebook using a Python 3.11 kernel.  \n",
     "\n",
-    "Pearson's correlation coefficient measures linear association between two variables, ranging from -1 to 1, where -1 is perfect negative linear correlation and 1 is perfect positive linear correlation.  The formula for calculating the PCC is as follows:\n",
     "\n",
-    "Given paired data, consisting of $n$ pairs: \n",
-    "$$\n",
+    "### <b> Development Process</b>\n",
     "\n",
-    "r =\n",
-    "  \\frac{ \\sum_{i=1}^{n}(x_i-\\bar{x})(y_i-\\bar{y}) }{%\n",
-    "        \\sqrt{\\sum_{i=1}^{n}(x_i-\\bar{x})^2}\\sqrt{\\sum_{i=1}^{n}(y_i-\\bar{y})^2}}\n",
-    "$$\n",
+    "My development process made use of the task’s inherent structure, allowing me to plan, develop and test each FR independently, before combining as needed.  This was especially useful for more complex FRs, which required significant iteration and testing before achieving the desired results.\n",
     "\n",
     "\n",
-    "source: [Wikipedia](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#Definition)\n",
-    "\n",
-    "\n",
-    " \n",
-    "### <b> Reflection on Development Process</b>\n",
-    "\n",
-    "My development process made use of the inherent stratified nature of the task, allowing me to plan, develop and test each FR independently in separate code cells, before combining where appropriate.  I found ths particularly useful for the more complex FRs - 4, 5, and 6 which required significant iteration, investigation and testing before achieving the desired results.  This also worked very well in terms of source and version control, complementing the use of Git.\n",
-    "\n",
-    "My strategy was to use a localised crisp-dm approach which included initial articulation of requirements followed by iterative pseudocode, Python code and testing cycles until the desired results were achieved.  I find this approach very effective and it is one I use in my current employment.  However, having said that, I find that I can occasionally spiral out of control in the testing phase, which can be time-consuming, frustrating and ultimately less productive.  I will continue to try to be mindful of this. \n",
-    "\n",
-    "\n",
-    "[create and insert a crisp-dm diagram here]\n",
-    "\n",
-    "I tried to be disciplined in my approach and for the most part feel like I was successful.  I made deliberate attempts to make use of newly acquired tools and techniques like Git, VS Code, Jupyter notebooks, Markdown - ensuring that I commented on my code, made use of Markdown cells during the development phase, and committing changes to Git in appropriate intervals.\n",
+    "I used a modified crisp-dm approach, understanding the requirements, then cycling through iterations of pseudocode, Python code and testing until achieving the desired results.   I found it very effective, but also that I can occasionally go “off-piste” in the iterations, which can be time-consuming, frustrating and ultimately less productive. \n",
     "\n",
+    "![](2022-12-18-23-11-02.png)\n",
+    "    \n",
+    "I made conscious use of “new-to-me” tools and techniques like Git, VS_Code, Jupyter notebooks, Markdown.\n",
     "\n",
     "### <b> Code Evaluation </b>\n",
+    "Overall, I am pleased with my code - functions achieve the requirements (as interpreted) and they <i>feel</i> efficient and robust.  \n",
     "\n",
-    "When it comes to code evaluation, I think it is important to acknowledge that the Task_1 code has been developed to meet the coursework specification.  As such, it has been produced with certain limitations and constraints and does not necessarily reflect my approach in a different context, such as my employment or personal projects.  However, it is important to reflect on what has been produced.  \n",
-    "\n",
-    "Overall, I am pleased with the code that I have produced.  It achieves the requirements (as I have interpreted them) and it <i>feels</i> fairly efficient and robust.  \n",
+    "Principles in mind when writing functions:\n",
     "\n",
-    "I tried to be mindful of certain principles when undertaking this work:\n",
+    "* Future-proofed:  generic, flexible, adaptable to allow reusability\n",
+    "* User-friendly, by adding assertions and error-handling\n",
+    "* Unambiguous, self-explanatory naming of functions and variables\n",
+    "* Helpful comments/docstrings by balancing approaches like DRY (Don’t Repeat Yourself), WET (Write Everything Twice), KISS (Keep it Simple, Stupid)\n",
     "\n",
-    "* Future-proofing the functions, i.e. making them flexible and adaptable to different data types and formats, perhaps accepting different parameters, etc. \n",
-    "* Thinking about likely pitfalls from the users perspective and pre-empting likely errors with assertions and exception handling\n",
-    "* Unambiguous, self-explanatory naming of functions and variables, where possible\n",
-    "* Balanced comments and docstrings - considering approaches like DRY (Dont Repeat Yourself), WET (Write Everything Twice), KISS (Keep it Simple, Stupid) bearing in mind possible readers of the code\n",
+    "#### <b> Strengths </b> \n",
+    "* Well-commented, functioning code\n",
+    "* Consistent Git use for version control\n",
+    "* Kept working notes\n",
     "\n",
-    "I think I have been reasonably successful in achieving these goals.  I have tried to make the code as flexible as possible, accepting different parameters and data types, and I have tried to make the code as robust as possible, using assertions and exception handling to pre-empt likely errors.  I have saved alternate versions of the functions for future use and reference, which take additional parameters or employ a different approach. \n",
-    "\n",
-    "In terms of naming convention, documentation and comments, I think I struck the right balance.  If anything, I have probably over-commented slightly but this is because I am erring on the side of caution.  I have yet to settle on a naming convention and standard and find that I flipflop between different methods like camelCase, snake_case, PascalCase, etc.  I am also aware that there seem to be conventions for variables and functions, especially iterables, which I am not fully using. \n",
-    "\n",
-    "As for the code itself, I think it is fairly efficient and robust.  Of course, there is always room for improvement and I would very much like to know of any suggestions or improvements, or better approaches.\n",
+    "#### <b> Improvements / To-do </b>\n",
+    "* Perhaps over-commented; erred on side of caution\n",
+    "* Establish preferred naming convention – camelCase, snake_case\n",
+    "* Learn Python conventions \n",
+    "* Don’t get side-tracked when testing\n",
+    "\t* Update pseudo code\n",
+    "\t\n",
+    "[Archived reflective notes by task](archived\\Task1_FR_reflections.md)\n",
     "\n",
     "\n",
     "#### <b> Summary </b>\n",
+    "I found this task both appealing and beneficial.  It allowed me to build a useful function from the ground up, making use of different Python coding techniques and data structures whilst also employing version control and applying appropriate metadata to the code.\n",
+    "\n",
+    "I am super-keen to keep learning for my personal and professional development, picking up best practice, standard approaches and avoiding pitfalls.  This task allowed me to practice all of this.  \n",
     "\n",
-    "Whilst I have a background in elements of data and systems, I am only getting started in the programming part of this journey.  I am super keen to learn for my personal and professional development and want to pick up best practices, adopt standard approaches and avoid pitfalls and repeating mistakes.  When it comes to Python, I am amazed at how many different ways there are to solve a particular scenario.  This can make it impossible to identify a 'single best approach' - something I will need to get used to and embrace.  \n",
+    "When it comes to Python, I am amazed at the many possibilities of solving the same scenario – this can make it challenging to identify the ‘best approach,’ if it exists.  This is something I will need to get used to and embrace.\n",
     "\n",
     "\n"
    ]
diff --git a/archived/Task1_FR_reflections.md b/archived/Task1_FR_reflections.md
new file mode 100644
index 0000000000000000000000000000000000000000..0dccd5e76c1ce277651dfc96bc8e74367a38477c
--- /dev/null
+++ b/archived/Task1_FR_reflections.md
@@ -0,0 +1,89 @@
+### FR1: arithmetic mean
+
+* easy to implement - good confidence boost
+* allows for getting into a flow of coding, including comments, docstrings, naming conventions (variables and functions), testing, etc.
+* I kept returning to the naming and commenting of the early requirements as the later ones developed.  
+* The challenge was getting the balance right - on the one hand, I wanted to ensure that I was including adequate comments but on the other, I did not want to over-document.  Bearing in mind that this is an assessed element of coursework submitted for a master's module, I think I have the right balance.  In 'the real world,' I would probably shorten some of the docstrings and remove some inline comments.  
+* My tactic has been to try to name variables and functions in an unambiguous, meaningful manner so that documentation is unnecessary.
+* I haven't settled on a style yet - e.g. casing (proper, camel, etc.) and use of underscores.  
+* When writing functions, I wanted them to have reuse value - that is, to keep them general.  With some requirements, this was not possible - some of my ideas would have deviated from the brief.  Examples later.  
+* I wanted to use helpful Python constructs where relevant, e.g. try/except blocks with specific (anticipated) errors captured.  
+
+* What I would do differently:
+  * I need to be mindful of shadowing built-in or reserved names with my variable names.  I used 'list' in the FR1 function.  This is not a problem in this instance but it could be in others.  I will need to be more careful in future.
+  * I also would like to assess the efficiency of my functions - there are so many ways to write functions producing the same outcomes in Python.  Absolutely amazed at how flexible and powerful it is - but at the moment, I am pleased when I can get my code to work.  Efficiency, elegance, etc. will come with time and experience - but I could start using timeit to assess processing times for different approaches (a quick sketch follows).
+
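+A minimal, hypothetical sketch of how timeit could compare two mean implementations - the function names here are illustrative, not the coursework functions:
+
+```python
+import timeit
+
+def mean_loop(values):
+    # accumulate with an explicit loop
+    total = 0
+    for v in values:
+        total += v
+    return total / len(values)
+
+def mean_builtin(values):
+    # lean on the built-in sum()
+    return sum(values) / len(values)
+
+data = list(range(10_000))
+
+# time each approach over 1,000 runs
+print(timeit.timeit(lambda: mean_loop(data), number=1_000))
+print(timeit.timeit(lambda: mean_builtin(data), number=1_000))
+```
+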
+### FR2: read single column from CSV file
+
+* Again, I wanted to make a reusable function, so I added 'delimiter' as an optional parameter.  I prefer to use pipes (|) as delimiters as they are very seldom used in text.  This is something I have found handy in data migration work I have undertaken in my current job.  Quite often, common characters like commas and 'invisible' spacing characters like tabs, new lines and carriage returns can wreak havoc when migrating data.  
+* I liked this function as it was a good opportunity to use the 'with' statement.  I like that it 'closes' the file when finished, without a need for an explicit close statement.  
+* The challenge for me in this requirement was working out how to extract the header row and then proceed to the data rows.  
+* It was a good example of where I knew what I wanted to do and it seemed simple enough, but took some trial and error before I got there (plenty more to come!)
+* Also added a specific FileNotFoundError exception as I expect this to be the most common error.
+* I also added an if/else statement to account for whether the file has a header - returning results accordingly.  While this is not in the brief, I was thinking about a future-proofed function which can handle both varieties (see the sketch below).
+
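+A minimal sketch of the shape described above - optional delimiter, `with` statement, header handling and a FileNotFoundError catch; the names are illustrative rather than the submitted function:
+
+```python
+def read_column(file_name, col_index=0, delimiter=',', has_header=True):
+    '''Read a single column from a delimited text file.'''
+    try:
+        # 'with' closes the file automatically when finished
+        with open(file_name) as f:
+            rows = [line.strip().split(delimiter) for line in f]
+    except FileNotFoundError:
+        print(f"File not found: {file_name}")
+        return None
+    if has_header:
+        # first row is the header; the rest are data
+        return rows[0][col_index], [row[col_index] for row in rows[1:]]
+    return [row[col_index] for row in rows]
+```
+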
+### FR3: read CSV data from a file into memory
+
+* Much of the above applies - the function docstring is probably a bit long but I wanted to be clear and complete in my documentation.
+* This function builds on FR2 - but returns a dictionary by iterating through the data rows of the file.  
+* I worked through many versions of this - for loops, enumerate, etc. - but settled on a list comprehension once I got it working the way it should.  I am not sure it is the most efficient way to do it, but it works.  I like the conciseness of list comprehensions, although they may not be as readable as a for loop.
+* In terms of variable names, I am not always clear on the standards - i, j, row, line, etc. and I vary between them - which annoys me.  Consistency, clarity, transparency, etc. are very important to me. I am not so happy with 'variable_data_dict' as a name, but couldn't find anything better - 'my_dict', 'data_dict', etc. - I am critical of them all. 
+
+
+* One problem with FR3 was discovered when using the function in early versions of FR4.  The original dictionary contained string values - which looked fine to me when printed.  It took me some time to realise that the values were strings, enclosed in single quotation marks.  This was certainly a source of frustration - something simply was not behaving the way I was anticipating.
+  * However, once I noticed, I tried to amend the list comprehension by converting the values to integers with int() - which caused an error: "invalid literal for int() with base 10: '60.5'"
+  * Therefore I converted the data to floats - which may not be quite right for ages, but will not impact the calculations.  A sketch of the eventual shape follows.
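+
+A sketch of that eventual shape, assuming a delimited file with a header row (names illustrative):
+
+```python
+def read_csv_data(file_name, delimiter=','):
+    '''Read a CSV file into a {column_name: [float, ...]} dictionary.'''
+    with open(file_name) as f:
+        rows = [line.strip().split(delimiter) for line in f]
+    header, data = rows[0], rows[1:]
+    # float() rather than int(): int('60.5') raises
+    # "invalid literal for int() with base 10: '60.5'"
+    return {name: [float(row[i]) for row in data]
+            for i, name in enumerate(header)}
+```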
+
+### FR4: Pearson Correlation Coefficient
+
+* I spent a lot of time on this function and tried to make it work using several methodologies.  
+* Pseudo code was particularly helpful in this instance, but I have a tendency to start with pseudo code and then deviate into a confused space of trial and error rather than updating the pseudo code.  Definitely something I can work on and take back into my current day job. 
+* Firstly I needed to remind myself how to calculate Pearson's r, having only ever had it calculated for me in SPSS or R.  There are a couple of approaches, but the strategy is to break the formula down into standalone calculations and build the function to produce these before putting it all together to return r.  
+* I spent a lot of time trying different approaches.  Again, like last time, I started with for loops and enumerate() but found that I couldn't always achieve my intentions - I would get slightly lost in the hierarchy of the multi-line approach - so I used list comprehensions instead and, in the end, I am pleased with the result.
+
+* As previously, I tried to think about generalising and future-proofing the function for future use, including what tests / checks / validation I could put in place.  The most obvious source of error will be the CSV file, so I added some assertions to ensure that the file has the correct data before proceeding to calculate Pearson's r.  
+* I tested my version of Pearson's r against NumPy's corrcoef() and it produces the same results for a small data set, which gives you that excitement of having achieved your intentions.  A sketch of the approach follows.
+
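+A condensed sketch of the component-sum strategy, cross-checked against NumPy's corrcoef() (the function and variable names here are illustrative):
+
+```python
+def pearson_r(x, y):
+    '''Pearson's r built from standalone component calculations.'''
+    n = len(x)
+    mean_x, mean_y = sum(x) / n, sum(y) / n
+    dx = [xi - mean_x for xi in x]   # deviations from the mean
+    dy = [yi - mean_y for yi in y]
+    numerator = sum(a * b for a, b in zip(dx, dy))
+    denominator = (sum(a * a for a in dx) ** 0.5) * (sum(b * b for b in dy) ** 0.5)
+    return numerator / denominator
+
+# quick cross-check against numpy on a small data set
+import numpy as np
+x, y = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
+print(pearson_r(x, y), np.corrcoef(x, y)[0][1])
+```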
+
+### FR5: Pearson Correlation Coefficients for a file
+
+* This function builds on FR4 by calculating Pearson's r for each pair of columns in a CSV file.
+* I enjoyed using FR3 and FR4 (which in turn use FR1) within this function.  That is pretty cool.
+* It does not have many lines of code but it took me a while to get it working.  I had to think about how to iterate through the columns of the CSV in terms of what is returned by FR3 and FR4, and then how to get the column names into a list of tuples.
+* It was not plain sailing and there was a fair bit of frustration - another good example of where I knew exactly what I wanted to do but it was hard to get there.  I had a few versions on the go - I think I could have benefited by sticking with one and updating the pseudo code.  
+* As previously, I moved from for loops to list comprehensions as I had better success getting the results I wanted, especially returning lists of tuples.  Initially, I was returning individual lists instead of lists of tuples.  
+* The joy of success after some struggle can be very satisfying. 
+
+I would probably look to round some of the output in future iterations.
+Actually - I just added the rounding now!  The pairing-and-rounding idea is sketched below.
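+
+A sketch of that idea, reusing the hypothetical helpers from the FR3 and FR4 sketches above:
+
+```python
+def pcc_for_file(file_name, decimal_places=4):
+    '''Pearson's r for every pair of columns, as a list of tuples.'''
+    data = read_csv_data(file_name)   # {column: [floats]}, per the FR3 sketch
+    names = list(data)
+    return [(a, b, round(pearson_r(data[a], data[b]), decimal_places))
+            for a in names for b in names]
+```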
+
+And finally
+
+### FR6: printing the results
+
+* I found this FR very challenging and a bit frustrating.
+* I could have benefitted significantly by sticking with my pseudo code and keeping it up to date - but once I got stuck in a cycle of trial and error, minor success and subsequent set-back, I did not revert to pseudocode and instead focused on cracking the issue which was hampering me.  
+* This function was created entirely through trial and error.
+* I broke it down into separate sections and tried to think how I would iterate through the list of tuples to get a decent end result.
+* Some sections worked, others didn't, changes then impacted the rest of the function and I was back to square one - hardforking!
+* Towards the end, I had functioning portions - the top table header, the column headers, the row headers, the r-values.  This was a mess of for loops, list comprehensions and string formatting which I needed to work through step by step.
+* I found cells in Jupyter notebooks to be invaluable - the ability to run portions of code and easily move them around is very effective and powerful.  I also found it very useful to comment out sections of code and then uncomment them to see how they impacted the rest of the function.
+* Finally, the pieces (rows of code) were lining up - at this point, I resorted to tweaking and assessing the impact of each change.  I discovered that I needed widths specific to the input table, rather than fixed widths, so created a separate function to calculate max column widths (sketched below) and then played with that to get the table to look passable.  
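+
+The width helper in miniature - the submitted max_col_width() takes a list of tuples; this sketch assumes the same and pads each cell to fit the widest entry:
+
+```python
+def max_col_width(tup_list):
+    '''Longest string representation across all elements of all tuples.'''
+    return max(len(str(item)) for tup in tup_list for item in tup)
+
+# pad every cell to the calculated width when printing a row
+row = ('BMI', 'Age', -0.3847)
+width = max_col_width([row])
+print(''.join(f"{str(item):>{width + 2}}" for item in row))
+```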
+
+Whew - got there in the end!  
+
+Although this was harder to get motivated over, as I am sure I would normally use some library to print tables without the effort and challenge - e.g. pandas, prettytable, etc. - in hindsight I am glad I persevered, as I have learned a lot about string formatting and how to iterate through lists of tuples.  It allowed me to troubleshoot, iterate through a requirement, and reassess and rework through many cycles.  I am sure I will use this knowledge in the future.
+
+### general
+
+* pleasure of the process; also frustrating
+* process - crisp-dm - iterative
+* related to work - > importance of requirements and flexing, agility
+* pseudo code - getting into habit has been very useful (have a tendency, which I need to fight, to dive into the deep end and just manage to keep afloat)
+* getting into git (hub, lab) - but have been frustrated (authentication with gitlab - needing to switch to https from ssh, despite seemingly set up correctly)
+* single powerful tool, work environment - vs code - revolutionary
+* documentation - robust, standardised, conventions - i.e. commenting
+* intersections of domain, maths, coding
+* not enough time
+