Biometrics Field Trial Evaluation Report
Section 6. Biometric System Performance
6.1 Background
This section describes the findings on the performance of the biometric systems during the field trial. It covers the quality of the fingerprints and photos collected and the matching performance of both biometric systems. The performance indicators are listed in Appendix D: Evaluation Methodology.
The field trial involved collecting 10 fingerprints and one photo, for enrolment purposes, every time a client applied for a visa at a participating visa office. When the client later arrived at a participating port of entry to Canada, a single fingerprint was collected for verification purposes, to confirm that the person who travelled to Canada was in fact the same person who received the visa.
All matching was conducted at the Headquarters Matching Centre (HQC). Matching involved searching the digitized photos and fingerprint images against the field trial database for all 18,264 client enrolments, which consisted of 14,854 temporary resident visa applications and 3,410 refugee protection claims.
6.1.1 Types of automated biometric matching
Three types of biometric matching were conducted during the field trial:
- One‑to‑many matching of all photos of clients applying. Photos enrolled as part of the field trial were compared with each other to identify duplicates and to detect possible fraudulent attempts. Of the 18,264 client enrolments, 41 had no photo associated with them as a result of operator error, and two (2) failed to enrol. This matching process therefore involved comparing all 18,221 photos with each other. The results of these 332,004,841 individual facial recognition matches are presented in Section 6.2.
- One‑to‑many matching of 10 fingerprints from all 11,623 [note 6] sets of fingerprints enrolled. The breakdown is provided in Figure 6‑A. This process helped determine the number of duplicate attempts made either legitimately, by clients making multiple applications to obtain a visa, or fraudulently. This matching process involved comparing all 11,623 sets of up to 10 fingerprints with each other. The results of these potentially 135,094,129 individual fingerprint matches are presented in Section 6.3.
Figure 6‑A: Initial fingerprint enrolments

- One‑to‑one matching of a single fingerprint presented at the port of entry for comparison against that client’s 10 originally enrolled fingerprints. The process matched 918 [note 7] single fingerprints presented at a port of entry by visa clients against the corresponding enrolled 10‑fingerprint set, retrieved by file number look‑up.
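The one‑to‑many comparison counts quoted above follow directly from the gallery sizes. A minimal sketch, assuming each record is searched against the entire gallery with self‑matches included (i.e. n × n comparisons rather than n(n − 1)/2):

```python
def one_to_many_comparisons(gallery_size: int) -> int:
    """Number of individual comparisons when every enrolled record is
    searched against every record in the gallery, self-matches included."""
    return gallery_size * gallery_size

# 18,221 enrolled photos and 11,623 enrolled 10-fingerprint sets:
print(one_to_many_comparisons(18_221))  # 332004841
print(one_to_many_comparisons(11_623))  # 135094129
```

These reproduce the 332,004,841 facial recognition comparisons and the 135,094,129 potential fingerprint comparisons cited above.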
Where a comparison yielded one or more possible matches—that is, matches with a biometric similarity score above CIC’s defined threshold—the forensic specialists evaluated the suggested matches and either accepted or rejected them. Under the rules of the field trial, in neither the accept nor the reject scenario was the resultant data provided to any of the participating ports of entry or visa offices, or to the Refugee Intake Centre.
6.1.2 Forensic specialist review of suggested automated biometric matching
The forensic specialists who reviewed the suggested matches used screens similar to those shown below.
Matches could either be performed on an individual query basis or in batches. The user performing the matching at the HQC could choose whether to match based on face only, on fingerprint only, or on a combination of both. When matching based on a combination, the user would specify whether the match results were to be sorted by face primary or by fingerprint primary.
Figure 6‑B shows a sample Level 1 match review screen. [note 8]
For this match, an individual probe record was matched by face primary. All facial recognition scores are above 72.25, the threshold that the forensic specialists set for the field trial as reflecting the best balance between correct matches and false rejects. The screen also shows the fingerprint scores, the name and the date of birth (if available). Having both scores together proved highly useful for analysis.
Figure 6‑B: Level 1 match review screen
Source: Demo records
*Probe—A biometric template that is used to search against a database(s)
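The review workflow described above can be sketched as follows. The 72.25 facial recognition threshold comes from the field trial; the candidate records and helper name are hypothetical illustrations:

```python
# Sketch of the forensic review queue: only candidate matches whose facial
# recognition score is above the field-trial threshold (72.25) are passed
# to a forensic specialist for accept/reject review.
FACE_THRESHOLD = 72.25

def candidates_for_review(results: list[dict]) -> list[dict]:
    """Keep only candidates scoring strictly above the threshold."""
    return [r for r in results if r["face_score"] > FACE_THRESHOLD]

# Hypothetical candidate list returned by a one-to-many search:
results = [
    {"file_no": "A-001", "face_score": 88.1},
    {"file_no": "A-002", "face_score": 72.25},  # at threshold: not reviewed
    {"file_no": "A-003", "face_score": 61.0},
]
print([r["file_no"] for r in candidates_for_review(results)])  # ['A-001']
```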
Figure 6‑C shows a sample Level 2 review screen. This screen was used to view a selected matching result image—photo, thumb and index fingerprint—side‑by‑side with the probe (the original photo) being searched against the database.
Figure 6‑C: Level 2 match review screen
Source: Demo record
Figure 6‑D shows the Level 3 screen for a fingerprint. This screen was used to enlarge the probe and result images side‑by‑side.
Figure 6‑D: Level 3 match review screen

Source: Demo record
6.2 Facial recognition
6.2.1 Facial recognition enrolment performance
For the total 18,264 enrolments, 18,223 had photos (41 records had no photos). Of those, 14,816 photos were from visa applicants, whose photos were scanned into the system at the visa offices at 300 dots per inch, and 3,407 [note 9] were from refugee protection claimants, whose photos were taken using LiveScan at the Refugee Intake Centre in Toronto.
Enrolment times for photo collection included scanning and cropping, which took approximately 10 seconds, plus approximately 30 seconds for the photo to be saved to the server. This included the time to create a highly compressed 3.25 KB photo, intended both for writing to a chip and for subsequent query display.
Failure to enrol photos
The field trial system attempted to enrol each scanned photo and gave each photo a quality score from 1 to 100. If a photo could not be enrolled, the system would give the photo a score of zero and move it to the “Failed to Enrol” section of the database.
Of the 18,223 photos for enrolment, only two (2) failed to enrol, for a total of 18,221 successfully enrolled photos for matching purposes. The first case was a photo of a child taken too close to the camera and posed at a 30‑degree angle. In the second case, the facial recognition software could not enrol the image because the individual had an eye injury.
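The scoring rule described above can be sketched as follows; `route_photo` is a hypothetical helper name, not part of the trial system:

```python
# Each photo receives a quality score from 1 to 100; a photo that cannot
# be enrolled is scored 0 and routed to the "Failed to Enrol" section of
# the database.
def route_photo(quality_score: int) -> str:
    """Return the database section a photo is routed to at enrolment."""
    if quality_score == 0:
        return "failed_to_enrol"
    return "enrolled"

print(route_photo(0), route_photo(57))  # failed_to_enrol enrolled
```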
Photo quality
Photo quality was examined from three perspectives:
- The system‑generated photo enrolment score
- Compliance of selected samples with International Civil Aviation Organization (ICAO) standards, as analysed by forensic specialists at three different periods of the field trial versus before the start of the field trial
- Compliance of all photos of clients who were verified at a port of entry with ICAO standards, as analysed by forensic specialists
Figure 6‑E shows that the biometric system found scanned visa photos to be of higher quality than refugee claimant photos for facial recognition. [note 10]
Figure 6‑E: System‑generated photo quality scores
Figure 6-F shows that while the system‑generated scores for visa applicants were higher than those for refugee claimants, there was little difference between genders.
Figure 6‑F: System‑generated scores for visa applicants and for refugee protection claimants by gender
Figure 6-G indicates the compliance level as determined from a review of 300 sample photos from visa offices before the field trial. The quality of photos prior to the Trial, in terms of compliance with ICAO standards, was quite low. Hence, the implementation team enhanced the photo specification training tools and guidelines prior to launching the Trial.
Figure 6‑G: Visa office compliance with ICAO standards—Pre field trial

Definitions
- Level 1: Meets all ICAO specifications
- Level 2: One or two ICAO minor violation(s)
- Level 3: Major ICAO violation(s)
Figure 6‑H indicates the compliance level as determined from a review of 600 sample photos throughout the field trial. The quality of field trial photos in terms of compliance with ICAO standards improved greatly during the Trial and became quite high.
Figure 6‑H: Compliance of photos with ICAO standards—During field trial

Figure 6‑I shows the results of the forensic specialist assessment of a sample of photos from clients who were verified at a port of entry during the field trial.
Figure 6-I: Forensic specialist assessment—Photos of clients with a fingerprint verification record on file

Source: Verifications Evaluation by Forensic Specialists
Quality definitions
- Level 1: Photos accepted
- Level 2: Photos accepted but with a slight ICAO violation
- Level 3: Photos rejected (several ICAO violations)
Many aspects of the facial recognition software were analysed for accuracy. Approximately 80% of the photos were examined for quality. The very few problems detected occurred when CIC photo specifications had not been followed. These problems could likely be eliminated with experience and proper guidance.
The problems encountered included the following:
- Lighting: Some of the photos had either too much or too little lighting, causing them to be too dark or too washed‑out for all of the features to be seen.
- Heads turned left, right, up or down: The turning of the head caused significant problems for the search engine, especially when the head was turned up or down.
- Eyeglasses: Eyeglass frames that cut across the top of the eyes and glare in the lens created problems for the search engine.
- Hair: Hair obstructing facial features also caused problems.
- Small photos: Some photos were simply too small, which made viewing difficult.
6.2.2 Facial recognition matching performance—Alone and in combination with fingerprints
There were 18,221 successful enrolments in the field trial. Matching pairs of photos were only found when individuals applied more than once for a visa or applied for a visa and then made a refugee protection claim.
Based on name searches, queries of existing CIC systems and extensive biometric match reviews, it was determined that 364 individuals interacted with CIC at least twice during the field trial and therefore provided at least two photos. Some of these individuals applied three or four times during the trial.
Those situations enabled CIC to analyse the matching performance of the biometric system. The analysis showed a total of 394 possible matching pairs. The 30 pairs in excess of the 364 individuals who interacted with CIC at least twice result from individuals who applied three or four times during the trial.
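The pair arithmetic above can be sketched as follows: an individual with k enrolments contributes C(k, 2) pairs of possible matches. The split of 352 / 10 / 2 individuals below is purely hypothetical; the report states only that 364 individuals enrolled at least twice and that 394 pairs resulted.

```python
from math import comb

def total_pairs(enrolment_counts: list[int]) -> int:
    """Sum of C(k, 2) pairs over individuals with k enrolments each."""
    return sum(comb(k, 2) for k in enrolment_counts)

# Hypothetical distribution consistent with the report's totals:
hypothetical = [2] * 352 + [3] * 10 + [4] * 2
print(len(hypothetical), total_pairs(hypothetical))  # 364 394
```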
Of the 394 pairs (Figure 6‑J):
- 195 times CIC had both photos and fingerprints to use for matching
- 182 times CIC had only photos to use for matching
- 17 times CIC had only fingerprints to use for matching (due to operational errors)
Figure 6‑J: Breakdown of biometric matching

Within the hundreds of suggested matches, the results would often follow a trend. For example, if the person had long hair, then most of the suggested matches would have long hair. If the person had glasses or head gear, then most of the suggested matches would have glasses or head gear.
However, this trend did not seem to pose a major problem. If the face of the person being compared was actually in the database, the system would find the correct match.
When the CIC photo specifications were followed, the facial recognition software proved to be an invaluable tool, successfully matching faces in a database of thousands, which a human being could never have accomplished in the same amount of time.
6.2.3 Correct identification matching for photo‑only
Of the 394 possible matches, 182 pairs of possible matches were found using facial recognition only, since these individuals did not submit two or more sets of fingerprints. In three cases, the facial recognition score was below the threshold set by the forensic specialists, so the system generated a false non‑match count of 3 (a correct match percentage of 98.1%).
Of the 394 possible matches, 195 were potential matches for which both facial recognition and fingerprints were available. Examining only the performance of the facial recognition system, the facial recognition score of the correct match was above the threshold set by the forensic specialists in 183 cases (93.8%).
In total, CIC had 377 potential matches using facial recognition. CIC’s correct match count was 362 (96.0%). Of these correct matches, 98.8% were the top‑ranked photo.
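The facial recognition totals above can be reproduced from the two subsets (182 photo‑only pairs, of which 179 were correct, plus 195 combined pairs, of which 183 were correct on faces alone):

```python
# Combining the two facial recognition subsets reported above:
correct = 179 + 183      # correct matches: 362
potential = 182 + 195    # potential matches: 377
rate = 100 * correct / potential
print(f"{correct}/{potential} = {rate:.1f}%")  # 362/377 = 96.0%
```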
6.2.4 Correct identification matching for photo combined with fingerprints
Of the 394 possible matches, 195 potential matches were found using a combination of facial recognition and fingerprints.
Examining the performance of both biometrics combined on this set of potential matches, CIC found the following:
- Using either facial recognition scores or fingerprint matching scores above their respective thresholds, 100% of matches were correct.
- Using both facial recognition scores and fingerprint matching scores and identifying a match only if both biometric results were above their respective thresholds, only 179 matches were made (91.8%).
- For the matches identified using both fingerprints and photos, CIC found that either biometric was adequate to confirm a positive match. However, facial recognition alone and fingerprint recognition alone each failed to identify two pairs above the recommended threshold, and the two false non‑matches for facial recognition were not for the same people as the false non‑matches for fingerprints. When combined, the two biometrics yielded all matches, as opposed to only 155 (98.7%) if either biometric was used alone.
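The two fusion rules contrasted in the bullets above can be sketched as follows. The 72.25 face threshold comes from the trial; the fingerprint threshold value here is an assumption for illustration, as the trial's value is not stated:

```python
FACE_T = 72.25     # facial recognition threshold from the field trial
FINGER_T = 100.0   # assumed fingerprint threshold (not given in the report)

def match_either(face: float, finger: float) -> bool:
    """Declare a match if at least one biometric clears its threshold."""
    return face > FACE_T or finger > FINGER_T

def match_both(face: float, finger: float) -> bool:
    """Declare a match only if both biometrics clear their thresholds."""
    return face > FACE_T and finger > FINGER_T

# A pair that fails on face alone but is caught by its fingerprint score:
print(match_either(70.0, 140.0), match_both(70.0, 140.0))  # True False
```

This illustrates why the "either" rule catches pairs the "both" rule misses: a false non‑match in one modality can be rescued by the other.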
6.3 Fingerprint recognition
6.3.1 Fingerprint enrolment performance
Section 8 describes the enrolment times for all 10 fingerprints (the 4 + 4 + 2 slaps) and for the single fingerprint collected for verification. The remainder of this section deals with the quality of both types of fingerprints.
Fingerprint quality was examined from the following perspectives:
- The forensic specialists’ assessment of the 10‑fingerprint enrolment after reviewing approximately 3,000 samples taken during the field trial
- The system‑generated fingerprint enrolment scores for all 10‑fingerprint enrolments
- The forensic specialists’ assessment of the 918 verification fingerprints and their 10‑fingerprint enrolments contrasting with the system‑generated scores
- Contrasting visa applicant versus refugee protection claimant template scores – A template is the biometric system‑generated data used to match individuals to each other. Scores for template quality rather than scores for image quality were required, because records imported for refugee protection claimants did not include scores for image quality
6.3.2 10‑Fingerprint enrolment quality assessment by forensic specialist
An initial assessment of fingerprint quality, conducted by reviewing fingerprint images, identified several issues. A review of fingerprints that were not enrolled by the fingerprint algorithm found that approximately 3,000 suitable impressions had not been enrolled. [note 11]
Figure 6-K shows examples of fingerprint impressions that were not successfully enrolled into the biometric system and were instead set aside in a “failed to enrol” file as images.
Figure 6-K: Fingerprints that were not enrolled

Initially, several high‑quality images were not enrolled into the biometric system, while several poor‑quality images were. See Figure 6‑L for examples of poor quality images that were enrolled.
Figure 6-L: Fingerprints that should not have been enrolled

After the concerns were raised with the vendor, a new biometric algorithm was included in the software package and the issue was resolved.
A second problem identified was ghosting—a different impression being included with the captured images for some fingerprint impressions. Often, the ghost image was of better quality than the actual impression. This problem seemed to occur in consecutive batches. The problem originated during the initial calibration process of the LS2 fingerprint capture devices. If a hand or fingers were present on the glass plate during the initialization of the device, the image of the hand or fingers would be included with each fingerprint impression taken on that device. This problem was rectified by issuing a communiqué requesting staff to ensure no prints were on the reader during its initialization process.
See Figure 6‑M for examples of the ghosting problem identified.
Figure 6‑M: Ghosting

Another problem encountered was cropped images, in which a portion of the fingerprint image was cut off. This occurred because the fingerprint slaps were taken outside of the acceptable scan area, so only part of some fingers was recorded. The issue was caused by a combination of operator error and software. The images came from visa offices and from port of entry immigration secondary environments; the Headquarters Matching Centre encountered cropped fingerprint images primarily in the single fingerprints collected for verification purposes.

Because the displayed acceptable scan area does not precisely match the actual acceptable scan area, operators may not have known that the images were not being correctly captured. However, some images were so cropped that it seemed the operator did not ensure that the client’s fingerprints were placed in the correct area. (Note: because of operational and facility‑related constraints, operators could not always see where clients had placed their fingers.) See Figure 6‑N for an example of cropped images.
Figure 6‑N: Cropped fingerprints
Figure 6‑O: Cut‑off fingerprints

The overall quality of the fingerprint impressions was excellent (85% to 90%). Poor impressions usually stemmed from the subject’s age or from other factors: the poor‑quality images did not result from the equipment or the operator, but from the client’s fingerprints having insufficient ridge detail to be captured.
The forensic specialist’s assessment of quality was based on the analysis of the overall fingerprints for clarity—how clearly the friction ridge detail is transferred from a three‑dimensional object (skin) to a two‑dimensional object (glass platen).
When evaluating a fingerprint, the following three levels of detail are examined (a standard used by fingerprint specialists around the world).
Level 1 detail refers to the overall pattern shape of the unknown fingerprint—a whorl, loop or some other pattern. This level of detail cannot be used to individualize, but it can help narrow down the search.

Level 2 detail refers to specific friction ridge paths—overall flow of the friction ridges and major ridge path deviations (ridge characteristics) like ridge endings, lakes, islands, bifurcations, scars, incipient ridges, and flexion creases.

Level 3 detail refers to the intrinsic detail present in a developed fingerprint—pores, ridge units, edge detail, scars etc.

6.3.3 System‑generated fingerprint enrolment evaluation
The biometric system’s auto‑generated fingerprint enrolment scores for all of the field trial’s 10‑fingerprint enrolments are presented below. Figures 6‑P and 6‑Q show that the biometric system gave fingerprints from both participating missions about the same average score.
Figure 6-P: Seattle Fingerprint Scores by Month

Figure 6‑Q: Hong Kong Fingerprint Scores by Month

Figure 6‑R: Breakdown of Automated 10-Fingerprint Quality Scores

Figure 6‑S: Fingerprint Quality Score by Gender

Figure 6‑T: Fingerprint Quality Score by Age

6.3.4 Forensic specialist assessment of verification fingerprint enrolment
Forensic specialists examined the images enrolled in the biometric system and their quality scores. Tables 6‑A and 6‑B show the system‑generated scores, along with the forensic specialists’ assessment of how the fingerprints would be judged using forensic specialists’ standard definitions.
Table 6‑A
| System score range | # | % |
|---|---|---|
| Range 1: 90‑100 | 29 | 3% |
| Range 2: 80‑89 | 274 | 30% |
| Range 3: 70‑79 | 314 | 34% |
| Range 4: 60‑69 | 251 | 27% |
| Range 5: 50‑59 | 28 | 3% |
| Range 6: <50 | 22 | 2% |
| Total | 918 | ‑ |
Table 6‑B
| Forensic specialists’ assessment range | # | % |
|---|---|---|
| Range 1: 90‑100 | 74 | 8% |
| Range 2: 80‑89 | 303 | 33% |
| Range 3: 70‑79 | 304 | 33% |
| Range 4: 60‑69 | 153 | 17% |
| Range 5: 50‑59 | 33 | 4% |
| Range 6: <50 | 51 | 6% |
| Total | 918 | ‑ |
Table 6‑C shows how the above values translate into forensic specialists’ standard definitions.
Table 6‑C
| Forensic specialists’ ranking | Number | Percent |
|---|---|---|
| Level 3 (Range 1,2) | 377 | 41.1% |
| Level 2 (Range 3,4) | 457 | 49.8% |
| Level 1 and below (Range 5,6) | 84 | 9.2% |
| Total | 918 | ‑ |
Table 6‑D shows the difference between the system score and the forensic specialists’ assessment.
Table 6‑D
| Forensic specialist assessment | Number | Percent |
|---|---|---|
| Agree with system | 557 | 61% |
| Disagree with system | 361 | 39% |
| Forensic specialist increased score | 232 | 25% |
| Forensic specialist decreased score | 129 | 14% |
| Total | 918 | ‑ |
Table 6‑E and Figure 6‑U show the quality of the fingerprints (all 10), as assessed by the forensic specialists, for those sampled with the verification fingerprints captured.
Table 6‑E: Quality of verification fingerprint
| Forensic specialists’ comment | Number | Percent |
|---|---|---|
| Sufficient quality | 645 | 70% |
| Ghosting | 59 | 6% |
| Cut off | 199 | 22% |
| Cut off: bottom | 39 | 4% |
| Cut off: side | 4 | 0% |
| Cut off: top | 146 | 16% |
| Cut off: multiple areas | 10 | 1% |
| Poor fingerprints | 2 | 0% |
| Multiple problems | 9 | 1% |
| Digital distortion | 3 | 0% |
| Other | 1 | 0% |
| Total | 918 | ‑ |
Figure 6‑U: Quality of Verification Prints as per Forensic Specialist
Analysis by the forensic specialists showed that approximately 70% of the fingerprints were of suitable quality. Several factors, including ghosting, reduced the quality of the fingerprint impressions. Because ghosting was discovered early in the field trial and was corrected, it should not have any significant consequence in the future. Another problem, affecting 22% of the verification fingerprints, was the cutting off of portions of the impressions: some fingerprints had the top, sides or bottom cut off, making searching difficult. This problem is easily corrected through updated software and better training for operators. The remaining problems were minor, each accounting for no more than 1% of the total; the most significant were poor impressions lacking sufficient friction ridge detail owing to ageing or work.
6.3.5 Visa applicants’ enrolment scores versus refugee protection claimant enrolment scores
For refugee protection claimants, the NIST (National Institute of Standards and Technology) record from the LiveScan system does not provide fingerprint image quality scores; it does, however, provide template enrolment scores. Since this metric is also available for visa fingerprints, Figures 6‑V and 6‑W contrast the 10 slap fingerprint sets collected from visa applicants with 10 rolled sets [note 12] collected from refugee protection claimants.
Figure 6‑V: TRV Fingerprint Template Quality Score

Figure 6-W: Refugee Fingerprint Template Quality Score

6.3.6 Fingerprint matching performance—One‑to‑many identification
Of the 18,264 files created during the field trial, 8,213 sets of 10 fingerprints were collected from visa applicants and 3,410 sets from refugee protection claimants, yielding a gallery* of 11,623 10‑fingerprint sets, as seen in Figure 6‑X.
Figure 6-X: Count of Initial 10-fingerprint sets
*A Gallery is the set of enrolled biometric images that will be searched against.
Matching on multiple 10‑fingerprint sets was possible in the case of individuals who did one of the following:
- applied multiple times for a visa
- applied for a visa and made a refugee protection claim
As described in section 6.2.2, the field trial consisted of 18,264 client enrolments, and 364 individuals interacted with CIC at least twice during the trial. There were a total of 394 pairs of possible matches.
The results for fingerprint matching when combined with facial recognition are presented in 7.3.2.
6.3.7 Correct matching for identification
Of the 394 possible matches, 17 pairs of possible matches using fingerprints only were found because individuals had provided incorrect or invalid photos. In all cases, the matching score was above the threshold set by the forensic specialists, so the system generated a correct match rate of 100%.
Of the 394 possible matches, 195 possible matches were found using a combination of facial recognition and fingerprints. Examining only the performance of fingerprinting, the scores of the correct matches were above the threshold set by the forensic specialists in 191 cases (97.9%).
In total, CIC had 212 potential matches using fingerprints, and the correct match count was 208 (98.1%). Had CIC lowered the threshold to boost the correct match rate to 100%, 2 (0.9%) incorrect matches (also known as false matches) would have been included.
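The fingerprint identification totals above can be reproduced from the two subsets (17 fingerprint‑only pairs, all correct, plus 195 combined pairs, of which 191 were correct on fingerprints alone):

```python
# Combining the two fingerprint identification subsets reported above:
correct = 17 + 191       # correct matches: 208
potential = 17 + 195     # potential matches: 212
print(f"{correct}/{potential} = {100 * correct / potential:.1f}%")  # 208/212 = 98.1%
```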
6.3.8 Matching performance—One to one
Of the 8,213 sets of 10 fingerprints collected from visa applicants, 918 single fingerprints taken at the ports of entry were enrolled into the biometric system for one‑to‑one matching purposes. Although all clients were instructed to present their right index finger, CIC asked that the system compare the presented fingerprint with all (usually 10) fingerprints enrolled. This request was made for two main reasons, which are expected to be desirable for a fully deployed system:
- All 10 fingerprints are in the system anyway. Comparing a single fingerprint against the person’s set of fingerprints eliminates the risk of either the officer requesting or the traveller placing the wrong finger, thereby causing a false rejection.
- This approach reduces the likelihood of a traveller trying to trick the system with a fake fingerprint by enabling the officer to request any finger at random.
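The verification rule described above can be sketched as follows: the probe fingerprint is scored against every enrolled finger and the best score decides, so presenting a finger other than the right index does not by itself cause a false rejection. The scores and threshold value are hypothetical illustrations:

```python
VERIFY_THRESHOLD = 100.0  # assumed value; the trial's threshold is not given

def verify(probe_scores_vs_enrolled: list[float]) -> bool:
    """Accept if the probe's best score against any enrolled finger
    exceeds the verification threshold."""
    return max(probe_scores_vs_enrolled) > VERIFY_THRESHOLD

# The presented finger matches one enrolled finger strongly, even though
# the scores against the other nine enrolled fingers are low:
print(verify([12, 9, 180, 7, 11, 8, 10, 6, 9, 7]))  # True
```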
The verification results are shown below. The unsuitable fingerprints were judged by the forensic specialists to be of too poor quality to assess whether or not a match existed.
Table 6‑F presents, for verification, the counts and percentages of correct acceptances, false rejects and false acceptances. In six cases, operational errors led to the wrong fingerprint being acquired.
Table 6‑F: Forensic specialists’ verification matching results
| System responses | Number | Percent |
|---|---|---|
| Unsuitable fingerprints | 36 | 3.9% |
| Total useable fingerprints | 882 | 96.1% |
| Correct acceptances | 810 | 91.8% |
| False rejects (same person, low score) | 52 | 5.9% |
| False acceptance (high score, wrong person) | 14 | 1.6% |
| Wrong fingerprint / person not verified | 6 | 0.7% |
| Total | 918 | 100.0% |
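The percentages in Table 6‑F are consistent with the first two rows being computed over all 918 verification fingerprints and the outcome rows over the 882 useable ones (an inference from the figures, not stated in the table):

```python
# Reproducing Table 6-F with its two denominators:
total, unsuitable = 918, 36
useable = total - unsuitable                    # 882
outcomes = {"correct": 810, "false_reject": 52,
            "false_accept": 14, "wrong_finger": 6}
assert sum(outcomes.values()) == useable        # outcome rows sum to 882

print(f"{100 * unsuitable / total:.1f}%")             # 3.9% of 918
print(f"{100 * outcomes['correct'] / useable:.1f}%")  # 91.8% of 882
```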
Footnotes:
6. 8,213 fingerprint sets from visa office clients seen in person and 3,410 sets from refugee protection claimants. There are no fingerprints for field trial clients who mailed in applications and who did not enter Canada at Vancouver.
7. While 1,020 one‑finger captures for verification purposes were performed during the field trial, only 918 could be enrolled into the biometric system (converted to a biometric template file).
8. All sample screens show test subjects, who did not participate in the field trial.
9. Of the 3,410 refugee protection claimants, 3 did not end up with any photo on file (file corruption is suspected), leaving 3,407 enrolled photos.
10. Photos with extremely low quality scores (14 for visas and 80 for refugees) could not be represented on this graph.
11. This meant that 3,000 impressions of individual fingers (not 3,000 clients) were not available for matching until the issue was resolved. Matching was successfully performed after the software fix.
12. Rolled prints refer to the more traditional way of capturing a person’s prints, in which each finger is “rolled” from fingernail to fingernail. This results in a larger and more complete print. Slap prints are a relatively newer technique and include only the pressed surface of the print.