Color maps in matplotlib

To draw a scatter plot with a specific color palette, pass the name of the colormap as the cmap argument.

Colormaps are grouped into categories (see the matplotlib documentation). For instance: Sequential – varying intensity of one color, Diverging – varying intensity of two contrasting colors, Qualitative – distinct colors that still match one palette, e.g. Pastel1, which contains different pastel colors.

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax.set_title('Chart title')
ax.set_xlabel('X axis label')
ax.set_ylabel('Y axis label')

# x - list with X axis values, e.g. product names
# y - list with Y axis values, e.g. product prices
# intensity - list of numbers describing the strength of the attribute
# cm - name of the selected color palette
# (the values below are sample data so that the script runs as-is)
x = ['pen', 'notebook', 'stapler']
y = [1.5, 4.0, 12.0]
intensity = [10, 50, 90]
cm = 'plasma'

ax.scatter(x, y, c=intensity, s=50, cmap=cm)

# the scatter plot is the first (and only) collection on the axes
mappable = ax.collections[0]
cbar = fig.colorbar(mappable=mappable)
cbar.set_label('intensity')

plt.show()

The s parameter sets the size of the drawn markers (the marker area, in points squared).

To display a colorbar that shows the intensity of a given feature for the plotted points, pass the mappable object to fig.colorbar(). For a scatter plot, this object is stored as an element of the collections list of the Axes object; it is also the value returned by ax.scatter().

As a result, a chart that shows, for example, product names and prices can carry additional information through color, e.g. popularity among buyers or the quantity of goods in the warehouse (from green, meaning the product is readily available, to red, meaning it is out of stock).
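The stock-level idea described above could be sketched as follows. This is only an illustration under stated assumptions: the product names, prices and stock levels are made-up sample data, stock_scatter is a hypothetical helper name, and RdYlGn (a diverging colormap running from red to green) is just one palette that fits the red-to-green description.

```python
import matplotlib.pyplot as plt


def stock_scatter(names, prices, stock_levels, cmap_name="RdYlGn"):
    """Plot price per product and color each point by its stock level."""
    fig, ax = plt.subplots()
    # ax.scatter() returns the collection that the colorbar needs as its mappable
    points = ax.scatter(names, prices, c=stock_levels, s=50,
                        cmap=cmap_name, vmin=0, vmax=max(stock_levels))
    cbar = fig.colorbar(points, ax=ax)
    cbar.set_label("Units in stock")
    return fig, points, cbar


# Sample (invented) data: 0 units maps to red, the largest stock to green
fig, points, cbar = stock_scatter(
    ["pen", "notebook", "stapler"], [1.5, 4.0, 12.0], [0, 35, 80])
```

Calling plt.show() afterwards displays the figure. Fixing vmin at 0 makes an empty warehouse always map to the red end of the palette, regardless of the smallest value in the data.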

DataFrame object in Pandas based on data from pdf file

Reading data from a PDF file requires the tabula-py module to be installed (it is a wrapper around tabula-java, so a Java runtime is needed as well). The module can also save the extracted data to a file in CSV or JSON format.

import tabula
df_list = tabula.read_pdf('file.pdf')

By default, i.e. when no value is given for the pages parameter, the read_pdf function reads only the first page of the PDF file (to load all pages, use pages='all').

The above function returns a list object containing successive DataFrame objects, for example:

df = df_list[0]        # first DataFrame object

Loading data from csv file to numpy.ndarray

When there are no missing values in the source data, we can use the numpy.loadtxt() function.

However, if some values are missing in the loaded file, we can use the numpy.genfromtxt() function instead, i.e.

import os
import numpy as np
script_dir = os.path.dirname(__file__)
path_to_file = os.path.join(script_dir, 'data_file.csv')

data_array = np.genfromtxt(path_to_file, dtype='str')

The genfromtxt() function returns an object of type numpy.ndarray. As additional function parameters, we can add e.g.

  • delimiter - the character that separates individual values
  • skip_header - the number of lines to skip at the beginning of the file
  • autostrip - a bool specifying whether leading and trailing whitespace should be stripped from the values

A list of all parameters can be found in the numpy.genfromtxt documentation.
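To see how the three parameters listed above work together, here is a small sketch. It reads from an in-memory string instead of a file so that it is self-contained; the column names and numbers are made-up sample data.

```python
import io

import numpy as np

# Sample (invented) CSV content: a header line, padded values,
# and a missing quantity in the last record
csv_text = (
    "price,quantity\n"
    " 1.5 , 10\n"
    " 4.0 ,\n"
)

data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",    # values are separated by commas
    skip_header=1,    # skip the line with column names
    autostrip=True,   # remove the spaces around each value
)

print(data)
```

With the default float dtype, the missing quantity is read as nan rather than raising an error, which is the main reason to prefer genfromtxt() over loadtxt() for incomplete data.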

Running the program from a script

To run another program from Python, we can use the system() function from the os module. A problem arises when we want to read the output of that program, e.g.

import os
result = os.system('df -h')
print(result)

The above program displays the output of the df program in the console, but the value of the result variable is only the exit status reported by os.system(), not the text that was printed.

If we want the result variable to store the output produced by the program, then instead of system() we should use the popen() function or its newer equivalent, the Popen class from the subprocess module, i.e.

# 1 first option - function popen()
import os
result = os.popen('df -h').read()
print(result)

# 2 second option - Popen class
import subprocess
command = subprocess.Popen('df -h', shell=True, stdout=subprocess.PIPE)
result = command.stdout.read().decode('utf-8')
print(result)
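A third option, shown here only as a sketch, is the subprocess.run() function (Python 3.7+ for the capture_output argument), which returns both the exit status and the captured output in one object. To keep the example independent of the operating system, it runs the Python interpreter itself instead of df; the command is an assumption made for illustration.

```python
import subprocess
import sys

# 3 third option - function run()
# The command is passed as a list, so no shell is involved
completed = subprocess.run(
    [sys.executable, '-c', "print('hello from child')"],
    capture_output=True,   # collect stdout and stderr instead of printing them
    text=True,             # decode the output to str
)

print(completed.returncode)   # exit status of the program
print(completed.stdout)       # text written by the program
```

Replacing the list with ['df', '-h'] would run the same command as in the examples above on systems where df is available.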

Parsing the CSV file

CSV files are text files in which each line represents one data record, and the individual data in the line is separated by a delimiter, usually a comma.

In the example below, we are parsing a refueling report from a gas station. The first line is the header and contains the data: Contractor’s data; Name; Surname; Correction number; WZ number; Date; Time; Counter; Station; Registration number; Card number; Product name; VAT percentage; Price at the station; Net price; Gross price; Discount value; Quantity; Net; VAT; Gross.

In this particular case, the delimiter is the semicolon character. The following lines will contain entries about the next refueling. We want to obtain from the source file data on the date of refueling, the registration number of the car and the number of liters of fuel taken.

import csv

with open('report.csv', newline='') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=';')
    total = 0
    for line in csv_reader:
        print('{}  {}  {} ltr'.format(
            line['Date'], line['Registration number'], line['Quantity']))

        total += float(line['Quantity'])
    print('Total: ', total, 'ltr')

    with open('new-report.csv', 'w', newline='') as new_csv_file:
        field_names = ['Date', 'Auto', 'Refueling']
        csv_writer = csv.DictWriter(
            new_csv_file, fieldnames=field_names, delimiter=';')
        csv_writer.writeheader()
        # rewind the source file and skip its header line,
        # which would otherwise be read as a data record
        csv_file.seek(0)
        next(csv_reader)
        for line in csv_reader:
            row = {}
            row['Date'] = line['Date']
            row['Auto'] = line['Registration number']
            row['Refueling'] = line['Quantity']
            csv_writer.writerow(row)

We perform parsing using the csv module. First, using a context manager, we open the report.csv file for reading. We read it through a DictReader object, which lets us refer to each value by the corresponding key from the CSV file header.
While printing the records, we also calculate total, the total amount of fuel taken.

We save the obtained refueling data in the new-report.csv file, this time using a DictWriter object. To iterate over the source file again, we move the file position back to the beginning with csv_file.seek(0). The reader already knows the field names, so after rewinding it would treat the header line as a data record; calling next(csv_reader) skips it. The output file gets its own header, built from the field_names list by writeheader(). Each new line is written with the writerow() method of the csv.DictWriter object.
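An alternative that avoids rewinding the file is to load all records into a list once and reuse that list. The sketch below uses in-memory text instead of real files so that it is self-contained; the dates, registration numbers and quantities are made-up sample data, and only three of the report's columns are included.

```python
import csv
import io

# Sample (invented) report content with a semicolon delimiter
source = io.StringIO(
    'Date;Registration number;Quantity\n'
    '2021-03-01;ABC 1234;40.5\n'
    '2021-03-02;XYZ 9876;35.0\n'
)

# Read every record once; each row is a dict keyed by the header names
rows = list(csv.DictReader(source, delimiter=';'))
total = sum(float(row['Quantity']) for row in rows)
print('Total: ', total, 'ltr')

# Write the selected columns under new header names
target = io.StringIO(newline='')
csv_writer = csv.DictWriter(
    target, fieldnames=['Date', 'Auto', 'Refueling'], delimiter=';')
csv_writer.writeheader()
for row in rows:
    csv_writer.writerow({
        'Date': row['Date'],
        'Auto': row['Registration number'],
        'Refueling': row['Quantity'],
    })

print(target.getvalue())
```

Keeping the rows in memory is convenient for small reports; for very large files, the seek-based approach or two separate passes over the file may be preferable.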

PySimpleGUI – faster GUI creation

PySimpleGUI is a wrapper that facilitates and speeds up the creation of Python window applications. There are 4 ports, based on the following libraries: tkinter, Qt, WxPython, Remi.

Changing the port, e.g. from PySimpleGUI (based on tkinter) to PySimpleGUIQt (using Qt), usually requires nothing more than changing the import statement. More details can be found in the PySimpleGUI documentation.

A sample program that collects data from the user and displays the data in a second window:

import PySimpleGUIQt as sg

layout = [[sg.Text('Please enter some sample text')],
          [sg.InputText()],
          [sg.Submit('Apply'), sg.Cancel('Cancel')]]

window = sg.Window('Data source window', layout)

event, values = window.Read()

window.Close()

if event == 'Apply':
    text_input = values[0]
    sg.Popup('Text entered:', text_input, title='Data display window')

The first line of the script imports the wrapper module (it must be installed in the system e.g. via pip, as well as the framework that is used by the port – in this case Qt).

The next line defines the layout, which is a list of lists. Each inner list defines the next row of the window: the Text element showing 'Please enter some sample text' in the first row, the InputText field in the second, and the Apply and Cancel buttons in the third.

Then we create a window and pass the previously created layout as the second argument. The Read() call waits until any button is pressed or the window is closed, and then returns the window state as a tuple consisting of the event (e.g. the button name, or None when the window's close button is pressed) and a dictionary whose keys identify the window's input fields and whose values are the contents entered in those fields.

When the key argument is not given for an input field in the layout, the dictionary keys default to consecutive integers. In the above example, the entered text, stored in the values dictionary under key 0, is passed as one of the strings displayed in the popup window (titled 'Data display window').

Window applications in Python and PyQt5 – using UI files

In Python, we can use .ui files created in the Qt Designer tool, which describe the appearance of the interface. To run a window application that loads such an XML .ui file and uses the Qt library, install the PyQt5 bindings, i.e.

pip install PyQt5

Then we can use the following code which loads the ui file and sets the title of the application window.

import sys
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5 import uic

class MyApp(QMainWindow):
    def __init__(self):
        super().__init__()
        # load the interface first: the .ui file may define its own
        # window title, which would override one set earlier
        uic.loadUi('app.ui', self)
        self.setWindowTitle('App Window Title')

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = MyApp()
    window.show()
    sys.exit(app.exec_())

The above code does nothing except display a window with the buttons, labels, etc. defined in the .ui file. For these elements to respond to, e.g., a click, we must connect the signals of a GUI element to slots. For example, for a button named calculateButton, we create a method that serves as the slot for the clicked signal. The list of signals that a given element can emit can be found in the Qt library documentation. We connect the method to the signal in the __init__() method, after the uic.loadUi() call (the widgets exist only once the .ui file is loaded), i.e.

self.calculateButton.clicked.connect(self.my_calculate_function)

and then define, in the same class, the method that will run when the button is pressed (here it only contains a pass statement):

    def my_calculate_function(self):
        pass