Keys

Keys are like special labels that help us compare rows between two tables. They consist of one or more columns in a table that, when specified, uniquely identify each row. Let's break it down with an example:

Example Tables

Tested Frame

ID	Another ID	Yet Another ID	Name	Age
1	101	201	John	20
2	102	202	Mary	21
2	103	203	Jane	22
3	103	203	Jack	23
3	104	205	Jill	24

Expected Frame

ID	Another ID	Yet Another ID	Name	Age
3	104	205	Jill	34
2	103	203	Jane	32
2	102	202	Mary	31
1	101	201	John	28
3	103	203	Jack	33

Using Keys for Comparison

If keys are specified (e.g., ID, Another ID, and Yet Another ID), rows with matching key values are compared. For instance, the first row in the expected frame is compared with the last row in the tested frame.

No Specified Key

If no key is specified, rows are compared based on their position (row number) in the tables. The first row in the first table is compared with the first row in the second table, and so on.

Non-Unique Keys

If keys are not unique (like having the same combination of ID, Another ID, and Yet Another ID), the rows are sorted based on the specified keys before comparison. Under the hood, tables are rearranged for fair comparison.

Different Datatypes in Keys

If key columns have different datatypes, they are converted to the same type before comparison. If the conversion fails, an error is thrown.

Keys help us make sense of data by ensuring we're comparing the right things between tables.

Levenshtein Distance Boolean comparisons