File Import Overview
The topics in this section provide an overview of the File Import process, the data that can be imported, and troubleshooting information.
File Import Process - Summary
File imports use Azure Storage file share folders for managing file import source files. This section summarizes the use of these folders.
- <prefix>-drop - Stores the source file from which data will be imported.
- <prefix>-processing - The source file is stored in this folder during the import. An integration log record will be created when the file is moved to this folder. The following temporary folders will be automatically created:
  - <source-file>: Contains temporary JSON files. The count of JSON files will depend on the count of records to be imported and the batch size. Example: The count of records in the source file for the TOEFL test is 900, and the batch size is 100. Hence the source file will be divided into 9 batches of 100 records each, which will all be imported simultaneously. If the batch size is greater than the count of records in the source file, all records will be imported in a single batch. (The batch arithmetic is sketched in the example after this list.)
  - <source file name>-<Error>: This folder will be created only if the source file contains any errors. It is created within the <source-file> folder and contains records that fail to import because of an error.
  - These temporary folders will be deleted when the source file is moved to the <prefix>-processed folder.
- <prefix>-processed - The source file will be moved to this folder when the import is complete in all import scenarios, i.e., whether the import is successful, partially successful, or fails. The integration log record that was created earlier will be updated when the import is complete.
- <prefix>-error - Contains a file of records that fail to import (format: <source file name>-Error). The extension of the error file will be identical to that of the source file.
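As a quick illustration of the batch arithmetic described above, the sketch below computes the number of temporary JSON batch files; the helper name is hypothetical, not part of the product:

```python
import math

def plan_batches(record_count: int, batch_size: int) -> int:
    """Return the number of temporary JSON batch files an import would create.

    Each batch holds at most `batch_size` records; a final partial
    batch is still written as its own JSON file.
    """
    if record_count <= 0:
        return 0
    return math.ceil(record_count / batch_size)

# Example from the text: 900 TOEFL records with a batch size of 100
# are split into 9 batches that are imported simultaneously.
print(plan_batches(900, 100))   # 9
# If the batch size exceeds the record count, a single batch is used.
print(plan_batches(80, 100))    # 1
```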
Reference Data
Records of entities in the following table can be imported or updated (if they are already available). The table describes details of infrastructure that is available by default to enable the import of reference data into Anthology Reach:
Entities | Microsoft Flow Names | Integration Mapping Records in Which Batch Size is Set | Folders for Processing Import
---|---|---|---
Reference data entities | | ReferenceData Import Batch Size | referenceData-drop, referenceData-processing, referenceData-processed, referenceData-error

To customize the logic in a flow:
- Copy the flow and edit the logic in the copied flow.
- Save and publish the copied flow in your institution’s implementation.
Before Importing Reference Data
- Format the Source File Name and Set the Template Type in the Mapping Record
For each entity, the name of the source file must be in the format <Value in the Template Type field in the Integration Mapping record>_<User-specified text>.xlsx or .csv. For example, the name of the source file for the Academic Period entity will be Default_AcademicPeriod_<User-specified text>.xlsx or Default_AcademicPeriod_<User-specified text>.csv.
If your institution is importing records of a custom reference entity, the name of the source file must follow the same format: <Value in the Template Type field in the Integration Mapping record>_<User-specified text>.xlsx or .csv.
The name of the vendor and the entity must be set in the Template Type field in associated Integration mapping records.
Example
The name of the vendor is Acme Corp and the name of the reference entity is School Name. These details must be specified as follows:
Source file name: AcmeCorp_SchoolName_Import322.xlsx or AcmeCorp_SchoolName_Import322.csv
Value in the Template Type field: AcmeCorp_SchoolName
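A minimal sketch of this naming rule follows, assuming the Template Type value is already set in the mapping record; the helper functions are illustrative, not part of the product:

```python
def build_source_file_name(template_type: str, user_text: str, extension: str) -> str:
    """Compose a source file name as <Template Type>_<User-specified text>.<ext>."""
    if extension not in ("xlsx", "csv"):
        raise ValueError("Source files must be .xlsx or .csv")
    return f"{template_type}_{user_text}.{extension}"

def matches_template(file_name: str, template_type: str) -> bool:
    """Check that a dropped file follows the expected prefix convention."""
    return file_name.startswith(template_type + "_") and \
           file_name.lower().endswith((".xlsx", ".csv"))

# Example from the text: vendor AcmeCorp, reference entity School Name.
name = build_source_file_name("AcmeCorp_SchoolName", "Import322", "csv")
print(name)                                           # AcmeCorp_SchoolName_Import322.csv
print(matches_template(name, "AcmeCorp_SchoolName"))  # True
```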
By default, the following integration mapping records will be available for each reference data entity:

- ReferenceData-<Entityname>.EntityType
  - Parameters – schema name of the entity
  - External Field Name and Internal Field Name – EntityType
  - Data Transformation Type – CONCATENATE
- ReferenceData-<Entityname>.Externalidentifier
  - Parameters – schema name of the field that stores the external ID of the entity
  - External Field Name and Internal Field Name – ExternalIdentifier
- ReferenceData-<Entityname>.ExternalSourceSystem
  - Parameters – schema name of the field that holds the external source system value of the specified entity
  - External Field Name and Internal Field Name – ExternalSourceSystem
To import records of custom reference data entities specific to your institution, for each entity:
- Create a copy of the above records with identical naming conventions.
- Ensure that the indicated values are specified in the above fields.
Test Score Data
Records of entities in the following table can be imported or updated (if they are already available). The table describes details of infrastructure that’s available by default to enable the import of test score records into Anthology Reach:
Entity | Microsoft Flow Names | Integration Mapping Records in Which Batch Size is Set | Folders for Processing Import
---|---|---|---
Test Score | | |
ACT (the source file must be in .csv format) | | |
GMAT | | |
GRE | | |
IELTS | | |
SAT | | |
TOEFL | | |

Records of related entities will also be created or updated if their details are available in the source file. The import flows are supported by additional helper flows.

For each test, the count of records that will be imported in each batch is set in a corresponding integration mapping record. By default, 100 test score records of each test can be imported in every import batch. The administrator can change this value in the Internal Option Value field in the corresponding Integration Mapping record.

Example: The count of records in the source file for the TOEFL test is 900, and the batch size is set to 100. Hence the source file will be divided into 9 batches of 100 records each, which will all be imported simultaneously. If the batch size is greater than the count of records in the source file, all records will be imported in a single batch.
For GMAT test scores being imported from .xlsx or .csv source files, a framework is available to import records if multiple column headers are identical in the source file. In such a scenario, in the associated integration mapping record, suffix the position of the column in the source file to the value in the field External Field Name. This will help to differentiate between the identical column headers.
Example
In the source file, the column header City occurs twice, in columns 9 and 20. Before performing an import, in their integration mapping records, suffix their positions to the value in the External Field Name field.
In the first record, the External Field Name will change from City to City-9. In the second record, it will change to City-20. By default, column numbers are suffixed to values in the field External Field Name. Administrators can modify how this framework is implemented at their institution.
The IsGMATOrderingRequired integration mapping record governs this behavior. In this record, the value of the field Internal Option Value is set to True, indicating that the framework is enabled.
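The renaming that this framework performs can be sketched as follows. Column positions are 1-based, matching the City-9/City-20 example above; the function is illustrative, not the product’s implementation:

```python
def suffix_duplicate_headers(headers: list[str]) -> list[str]:
    """Append the 1-based column position to headers that occur more than once,
    mirroring the suffixed External Field Name values in the mapping records."""
    counts: dict[str, int] = {}
    for h in headers:
        counts[h] = counts.get(h, 0) + 1
    return [
        f"{h}-{pos}" if counts[h] > 1 else h
        for pos, h in enumerate(headers, start=1)
    ]

# Example from the text: "City" appears in columns 9 and 20.
headers = [f"Col{i}" for i in range(1, 9)] + ["City"] \
        + [f"Col{i}" for i in range(9, 19)] + ["City"]
renamed = suffix_duplicate_headers(headers)
print(renamed[8], renamed[19])   # City-9 City-20
```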
To Start the Import
- In Microsoft Azure, upload a source file to the <prefix>-drop folder (see the sketch after these steps).
  The extensions of files can be different in each folder. For example, the SAT source file can have the .xlsx extension, the GRE file can have the .csv extension, and multiple reference entity files placed in the folder referenceData-drop can have different extensions.
  Caution: Ensure that you place the source file in the correct drop folder.
- The import operation will start on the next run of the associated flow. By default, the flows are set to run daily at 12:00 hours UTC. The administrator can change these settings and also run the flow manually.
During the import, the source file will be moved to the <prefix>-processing folder.
When the import is complete, the source file will be moved to the <prefix>-processed folder in all import scenarios, i.e., if the import is successful, partially successful or fails.
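As an illustration of the upload step, here is a minimal sketch using the azure-storage-file-share Python SDK. The connection string, share name, file name, and the testScore-drop prefix are placeholders, not values defined by the product:

```python
# A minimal upload sketch (pip install azure-storage-file-share).
import os
from azure.storage.fileshare import ShareFileClient

conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
local_path = "Default_TOEFL_Import322.csv"        # hypothetical source file

# Target path: the <prefix>-drop folder that the flow monitors.
file_client = ShareFileClient.from_connection_string(
    conn_str=conn_str,
    share_name="import-share",                    # placeholder share name
    file_path=f"testScore-drop/{local_path}",     # hypothetical <prefix>-drop folder
)

with open(local_path, "rb") as source:
    file_client.upload_file(source)               # picked up on the next flow run
```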
Notes:
- The import of blank rows in source files will be skipped, and empty records will not be created.
- If errors occur when content in .txt format is imported, the errors will be displayed as follows in the file in the <prefix>-error folder:
  <Source file content>||<Row number> <Error details>|<Flow URL>
  This text will be displayed at the end of the row. Before you plan to import content from the error file, for each error, delete all content after the || (double pipe) characters.
- If errors occur when content in .xlsx or .csv format is imported, in the <source file name>-Error.<extension> file, delete all content in the errorRowNumber, errorDetails, and flowURL columns, including the column headers. These columns will be displayed at the end of the file.
- To import content from the error file, save the file with a unique name after performing the above deletions and then place the file in the <prefix>-drop folder. The import operation will start on the next run of the appropriate flow. (A sketch of this cleanup appears after these notes.)
- The Integration Status field in the associated Integration Log record will be set to one of the following values:
  - Success: Indicates that the import operation was successful. The text in the Details field will be: Data from file <source file name>.<extension> is imported successfully.
  - Failed: Indicates that the import operation encountered an error. The text in the Details field will be: Data from file <Source file name>.<extension> failed for <count> rows, created error file with the name <Source file name>-Error.<extension>.
- For every source file placed in the <prefix>-drop folder, a unique Integration Log record will be created.
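The cleanup described in these notes can be sketched as follows, assuming the || separator for .txt files and the errorRowNumber, errorDetails, and flowURL columns for .csv files; the file names are placeholders:

```python
import csv

def clean_txt_error_file(in_path: str, out_path: str) -> None:
    """Strip everything after the || separator so rows can be re-imported."""
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.split("||", 1)[0].rstrip("\n") + "\n")

def clean_csv_error_file(in_path: str, out_path: str) -> None:
    """Drop the errorRowNumber, errorDetails, and flowURL columns appended
    at the end of the error file, including their headers."""
    drop = {"errorRowNumber", "errorDetails", "flowURL"}
    with open(in_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        header = next(reader)
        keep = [i for i, name in enumerate(header) if name not in drop]
        writer = csv.writer(dst)
        writer.writerow([header[i] for i in keep])
        for row in reader:
            writer.writerow([row[i] for i in keep if i < len(row)])

# Save the cleaned file under a unique name, then place it in <prefix>-drop.
clean_csv_error_file("Default_TOEFL_Import322-Error.csv",
                     "Default_TOEFL_Import322_Retry.csv")
```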
Duplicate Checks for the Import Process
The topics in this section provide information on how duplicate records are handled during an import process.
Verification to Identify Duplicate Records
Import operations include three levels of verification to determine whether records being imported are duplicates:
- Level 1: For vendors that have a two-way integration with Anthology Reach, the GUID originally sent from Anthology Reach will be returned when records are imported from the same source. In this scenario, changed records from the source system will be updated.
- Level 2: If the incoming record does not have an Anthology Reach GUID but has its own unique ID, and if a record from the same source with the same unique ID is available, the record will be updated. Otherwise, a new record will be created in Anthology Reach after an internal duplicate check is performed based on field matching criteria.
- Level 3: Institutions can configure the native duplicate check criteria that are built into Microsoft Dynamics 365. These will be triggered if neither the level 1 GUID nor the level 2 unique ID is passed into Anthology Reach during an import.
In all scenarios:
- A new record will be created if it was not previously available.
- The record will be updated if it was previously available.
- An error will be logged if the record being imported matches multiple destination records.
This three-level verification process enables institutions to configure duplicate check criteria that are unique to their institution.
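A condensed sketch of the three-level cascade follows, with illustrative function and field names rather than the product’s actual API:

```python
def resolve_incoming_record(incoming, find_by_guid, find_by_external_id, find_by_rules):
    """Illustrative cascade for the three verification levels.

    Each find_* callable is a stand-in for a lookup against Anthology Reach
    and returns a list of matching destination records.
    """
    # Level 1: a two-way vendor returns the GUID originally sent by Reach.
    if incoming.get("reach_guid"):
        matches = find_by_guid(incoming["reach_guid"])
        if matches:
            return ("update", matches[0])

    # Level 2: match on the source system's own unique ID.
    if incoming.get("external_id"):
        matches = find_by_external_id(incoming["external_id"],
                                      incoming.get("source_system"))
        if matches:
            return ("update", matches[0])

    # Level 3: native Dynamics 365 duplicate detection rules (field matching).
    matches = find_by_rules(incoming)
    if len(matches) > 1:
        return ("error", None)   # multiple destination matches are logged as errors
    if matches:
        return ("update", matches[0])
    return ("create", None)
```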
During an import that updates records, the following conditions are applied along with the native duplicate check functionality in Microsoft Dynamics 365. All fields listed for a duplicate detection rule must match:

Entity | Duplicate Detection Rule | Field | Criteria
---|---|---|---
Application | Application with the same Application Registration, Application Period, Program, and Program Version | Application Registration | Exact match
 | | Program | Exact match
 | | Program Version | Exact match
 | | Application Period | Exact match
Application Registration | Application Registration with the same Application Definition Version and Contact | Application Definition Version | Exact match
 | | Contact | Exact match
Contact | Contact with the same Last Name, First Name, and Birthdate | First Name | Exact match
 | | Last Name | Exact match
 | | Date of Birth | Same date
Contact | Contacts with the same National Identifier | National Identifier | Exact match
Experience | Experience with the same Contact, Title, and Organization | Contact | Exact match
 | | Title | Exact match
 | | Organization ID | Exact match
Extra Curricular Participant | Extra Curricular Activity Participant with the same Contact and Extra Curricular Activity | Student | Exact match
 | | Extra Curricular Activity | Exact match
Previous Education | Previous Education with the same Contact, School Name, and Education Level | Student | Exact match
 | | School Name | Exact match
 | | Education Level | Exact match
Test Score | Test Score with the same Contact, Test Type, and Test Source | Student ID | Exact match
 | | Test Type | Exact match
 | | Test Source | Exact match
 | | Test Date | Same date and time
Import Failure Scenarios
Some of the scenarios in which an import can fail include:
- When multiple records with the same information are already available. For example, two records of the contact Alex Hales exist in Anthology Reach, and the source file also contains a record with the same information.
- If an Integration Mapping record does not exist for a specific field of the entity.
Viewing Import Failure Error Messages
Depending on the type of error, the error messages for an import failure are logged in one of the following locations:
- In the error log (CSV) file in the <source file name>-<Error> folder in the Azure portal.
- In the Integration Log record (under Settings > Integration) in the Anthology Reach application.
Troubleshooting Tips
- Ensure that integration mapping records are available for all source fields and their associated values in destination records.
- Before importing content, ensure that information in the data source file is correct. This will prevent records from being omitted in the import operation.
- Ensure that you place the source file in the appropriate <prefix>-drop folder.