Generating Reports

Created on Thu Jan 4 14:00:00 2024

Copyright 2024 Roy Ruddle

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class vizdataquality.report.Report

This class allows users to write a report while they investigate the data quality of a dataset and profile it. Reports may have headings, text, figures and tables. The overall structure may follow the six steps we suggest or be freeform. Reports may be output as a webpage, in LaTeX or in a text file.
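A minimal sketch of the workflow this class supports, using only methods documented on this page. The helper name build_basic_report and the example strings are hypothetical, chosen for illustration; they are not part of the package.

```python
def build_basic_report(report):
    """Add a title, a heading and a paragraph to a Report-like object.

    `report` is assumed to expose the add_title(), add_heading() and
    paragraph() methods documented on this page.
    """
    keys = []
    # Each call returns the key under which the item was stored.
    keys.append(report.add_title("Data quality report"))
    keys.append(report.add_heading("Overview", level=1))
    keys.append(report.paragraph("Initial notes on the dataset."))
    return keys
```

With the package installed, this might be called as build_basic_report(Report()) after "from vizdataquality.report import Report". Whether the constructor takes arguments is not stated on this page, so the no-argument call is an assumption.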

add_acknowledgements(text=None, key=None)

Add the supplied acknowledgements to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the acknowledgements in the report dictionary.

Return type:

int

add_descriptive_stats(df, text=None, caption=None, key=None)

Add descriptive statistics (e.g., calculated by calc() in calculate.py) to the report.

Parameters:
  • df (dataframe) – The descriptive statistics to be added. The index is the variable names.

  • text (string) – Text to add to the report. The default is None.

  • caption (string) – A caption for the table. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the table in the report dictionary.

Return type:

int

add_figure(filename, text=None, caption=None, key=None)

Add the supplied figure to the report.

Parameters:
  • filename (filename) – The file containing the figure.

  • text (string) – Text to add to the report. The default is None.

  • caption (string) – A caption for the figure. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the figure in the report dictionary.

Return type:

int

add_heading(heading, level=1, text=None, key=None)

Add the supplied heading to the report.

Parameters:
  • heading (string) – The heading.

  • level (int) – The heading level, from 1 to n. The default is 1.

  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the heading in the report dictionary.

Return type:

int

add_table(df=None, index=False, filename=None, text=None, caption=None, key=None)

Add the supplied table to the report, given either as a dataframe or as a file.

Parameters:
  • df (dataframe) – The table to be added. The default is None.

  • index (boolean) – Whether or not to output the dataframe index in the report. The default is False.

  • filename (filename) – The file containing the table. The default is None.

  • text (string) – Text to add to the report. The default is None.

  • caption (string) – A caption for the table. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the table in the report dictionary.

Return type:

int
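As an illustration of the parameters above, a dataframe could be added with a caption as follows. The wrapper name add_dataframe_table and its show_index argument are hypothetical; only the add_table() keywords come from this page.

```python
def add_dataframe_table(report, df, caption, show_index=False):
    """Add `df` to the report as a captioned table and return its key."""
    # Pass the dataframe by keyword: per the parameter list, add_table()
    # also has a filename parameter for a table stored in a file.
    return report.add_table(df=df, index=show_index, caption=caption)
```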

add_title(heading, key=None)

Add the supplied title to the report.

Parameters:
  • heading (string) – The title.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the title in the report dictionary.

Return type:

int

dataset_size(name, num_rows, num_cols, text=None, caption=None, key=None)

Add a table summarising the size of a dataset to the report.

Parameters:
  • name (string) – The name of the dataset

  • num_rows (int) – Number of rows in the dataset

  • num_cols (int) – Number of columns in the dataset

  • text (string) – Text to add to the report. The default is None.

  • caption (string) – A caption for the table. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the table in the report dictionary.

Return type:

int
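The row and column counts would typically come from the dataset itself. A short sketch, assuming the size is available as a (rows, columns) tuple such as the shape attribute of a pandas dataframe; the helper name add_size_summary and the caption wording are hypothetical.

```python
def add_size_summary(report, name, shape):
    """Record a dataset's size, given a (rows, columns) tuple such as df.shape."""
    num_rows, num_cols = shape
    return report.dataset_size(
        name,
        int(num_rows),
        int(num_cols),
        caption=f"Size of the {name} dataset",
    )
```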

get_report_dict()

Get the content of the report.

Parameters:

None.

Returns:

The report items in a dictionary.

Return type:

dict
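The keys returned by the other methods are described as keys of this dictionary, so it should be possible to look items up again later. The sketch below rests on that assumption; the helper name items_for_keys is hypothetical, and the structure of each stored item is not described on this page.

```python
def items_for_keys(report, keys):
    """Return the stored report items for the given keys, skipping unknown keys."""
    contents = report.get_report_dict()
    return {k: contents[k] for k in keys if k in contents}
```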

paragraph(text, key=None)

Add a paragraph of text to the report.

Parameters:
  • text (string) – The text of the paragraph.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the paragraph in the report dictionary.

Return type:

int

save(filename, overwrite=True, table_kw={}, **kwargs)

Save the report in a text format file.

Parameters:
  • filename (filename) – The file in which to save the report.

  • overwrite (boolean, optional) – Whether or not to overwrite an existing file. The default is True.

  • table_kw (dictionary) – Keyword arguments for pd.DataFrame.to_html() or to_latex(). Default is an empty dictionary.

  • **kwargs (dictionary) – Keyword arguments for open()

Return type:

None.
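A hedged sketch of a save() call that uses every parameter above. The wrapper name save_report is hypothetical. The choice of na_rep relies on it being accepted by both pandas DataFrame.to_html() and to_latex(), and passing encoding assumes save() does not already set it when calling open().

```python
def save_report(report, filename, overwrite=False):
    """Save the report, forwarding table and file options to save()."""
    report.save(
        filename,
        overwrite=overwrite,
        # Forwarded to DataFrame.to_html() or to_latex(); na_rep is accepted by both.
        table_kw={"na_rep": "-"},
        # Forwarded to open() via **kwargs.
        encoding="utf-8",
    )
```

Passing a fresh dictionary for table_kw on each call avoids relying on the mutable default in the signature.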

step1(text=None, key=None)

Add the Step 1 heading to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the step in the report dictionary.

Return type:

int

step2(text=None, key=None)

Add the Step 2 heading to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the step in the report dictionary.

Return type:

int

step3(text=None, key=None)

Add the Step 3 heading to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the step in the report dictionary.

Return type:

int

step4(text=None, key=None)

Add the Step 4 heading to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the step in the report dictionary.

Return type:

int

step5(text=None, key=None)

Add the Step 5 heading to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the step in the report dictionary.

Return type:

int

step6(text=None, key=None)

Add the Step 6 heading to the report.

Parameters:
  • text (string) – Text to add to the report. The default is None.

  • key (string) – User-defined name of this report item. The default is None.

Returns:

The key used for the step in the report dictionary.

Return type:

int
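The six step methods share one signature, so a report that follows the suggested structure could be laid out in a loop. A minimal sketch; the helper name add_six_steps and the notes mapping are hypothetical.

```python
def add_six_steps(report, notes):
    """Add the Step 1 to Step 6 headings in order.

    `notes` maps a step number (1 to 6) to optional text for that step.
    """
    keys = []
    for number in range(1, 7):
        step = getattr(report, f"step{number}")  # step1 ... step6
        # Steps without a note are added with text=None, the documented default.
        keys.append(step(text=notes.get(number)))
    return keys
```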