Case Else:

/*** Code’s last stand ***/

Case Else: Killer Whales hunting off the Haida Gwaii (or Queen Charlotte Islands)

Matching Arbitrary Name Lists in Active Directory

September 14th, 2008 · 4 Comments

This Hello, My Name is stickerscript will take a CSV file of two columns — ostensibly in ‘last-name, first-name’ format — and attempt to convert it into valid userIDs, despite irregularities in the list.


I frequently receive requests to automate the retrieval of user IDs for a list of arbitrary names, such as from Active Directory. The name list largely consists of last names followed by first names, but some of the names are reversed. Some of the names are nicknames or initials.

In our example case, the names in Active Directory aren’t properly normalized either. Merging AD Domains from multiple, older networks often results in non-normalized data (or more specifically, differently normalized data). Names may be entered as proper names or nicknames, and nicknames may be notated in ad-hoc formats (e.g., in parentheses, etc). Names may be initialed—especially middle names—and abbreviations may or may not have periods. Last names may have suffixes (Jr, III, etc.) in a variety of formats.

Because the names in AD are not normalized, plus the arbitrary state of the names list we have to match, there is no way that I know of for a script to reliably normalize the list or the database, let alone compare them both. We could attempt to build-in dictionaries of names and compare them, but we would still fail on corner cases.

For example, if I tell you a full name is some arrangement of ‘Ernie’, ‘Ford’, and ‘Tennessee’, how can you guarantee what order the partial names are? While you may be able to make an educated guess, how do you write a script that guesses correctly? What about ‘John Olivia Newton’? Even if you built a dictionary of English names, what if I gave you ‘Ali Muhammad’ with no specified order?

No matter how we attempt to automate matching these names to the data, there will still have to be a manual reckoning. The goal for the script is to make that manual process as simple and automatic as possible so that it can be done quickly, efficiently, and—most importantly—by somebody else. The goal isn’t to get 100% accuracy—80% would be nice with some lists. On the other hand, manually resolving 20% of a list of thousands of names is still a big job, but the 80% saving is huge.

If you have such a file, all you actually have to do is drop it onto this script, and the script will create an output file in the same folder. You can also pass the file name as a command-line parameter. The script doesn’t take extra parameters—the output file will be the input filename with ‘.out.tab’ tacked onto it, etc.

Name dumps typically mix up first and last names, so this script tries to match names in both directions. The exception to that is the simplest translation. If ‘Smith, Andrew’ is in the file, and matches to ‘Smith, Andrew’ in AD, there is a strong indication (the comma) that the names are in the same order. That’s considered a quick match and warrants no further resolution.

If the simple translation fails, the script begins to run queries against AD, successively truncating the names (attempting to remove anything like initials, suffixes, etc.), and using them as wildcards until something is found, and eventually trying pieces of the names if necessary. The output includes profile information, as available, to help identify users.

Caveats

  • Quick matches run through a fast translation (the AD NameTranslate object built into XP). There is some room for error; if the names have duplicates, for example (note Gene Cox in the list below). If this assumption of accuracy doesn’t work for your situation, you can disable that function (set it to just return an empty string).
  • There is no realistic way for a script to know that ‘Bob Smith’ could be listed as ‘Robert Smith’, so it keeps broadening the net until something becomes plausible. If there is already a ‘B. Smith’ of some other name, broader scans will not run, so Robert may never show in the list.
  • Some queries (such as Bob Brown or Will Smith) may return too many names to be useful. There is a configurable threshold for that eventuality.

You will notice that the output file is a .tab (which is probably an unrecognized type on your system) and that it looks oddly formatted. The file is designed so that you can run through the output fairly rapidly in a text editor and delete extraneous names, leaving the correct user IDs. Open the remainder in Excel (just right-click the file, and open it with Excel, in most cases), and all of the user IDs are in the third column—so even though editing is a manual process, it’s still relatively quick and easy.

Attached is a sample of the output, with mostly trouble names included.

Wesley Walker	=	OurCorp\WWalker
Betn Gotham	=	OurCorp\BGotham
MacLarsen, William C. (Carlson) = ...
      {searching for 'MacLarsen' & 'William C. (Carlson)'}
      Carlson MacLarsen (Carlson MacLarsen Project Office) (MAC Services - Office) =	CMacLars
Ferrous, B D = ...
      {searching for 'Ferrous' & 'B D'}
      Ferrous, Brandon (Desktop Ops) =	BFerrou
      Ferrous, BD (Corporate Services) =	BDFerro
O'Malley, Katherine = ...
      {searching for 'O''Malley' & 'Katherine'}
      Katherine O'Malley (Company User Administration) (Company Administration) =	KOMalle
      Katherine O'Malley (Network & Server) =	KOMalle
Thistle, Bob = ...
      {searching for 'Thistle' & 'Bob'}
      did not find 'B* Thistle' or 'B* Bob'
      Broadening Search...
      {searching for 'Thistle'}
      Robert Thistle (Network & Server Group) =	RThistle
      Thistle, Alex (Michael Robert) =	AThistle
      {searching for 'Bob'}
      did not find '* Bob'
White, Clarence = ...
      {searching for 'White' & 'Clarence'}
      Found 57 of 'C* White' or 'B* Clarence'. 30 is too many.
Garofolo, Gerald = ...
      {searching for 'Garofolo' & 'Gerald'}
      Jerry Garofolo (Project Coordinator) (Business Support) =	JGarofol
Gene Cox      =     DOMAIN\ADMIN
Priestley Jr, Carl = ...
      {searching for 'Priestley Jr' & 'Carl'}
      did not find 'C* Priestley Jr' or 'C* Carl'
      Broadening Search...
      {searching for 'Priestley Jr'}
      did not find '* Priestley Jr'
      Trying just the first part (Priestley)
      {searching for 'Priestley*'}
      Nick Priestley =	NPriestle
      Priestley, Carl (Legacy Support)  =	CPriestle
      {searching for 'Carl'}
      Kelly Carl (HR) =	KCarl
      John Carl (HR) =	JCarl

Below is the script:

Option Explicit

Const LDAPOU = "dc=domain,dc=Company,dc=NET"
Const TOOMANYCHOICES=30

Const ForReading = 1
Const ForWriting = 2

dim objArgs
Set objArgs = WScript.Arguments
If objArgs.count < 1 Then
	wscript.echo "Usage: userstocomputers.vbs input.csv"
	wscript.quit
End If

Dim strInputFileName, strOutputFileName

strInputFileName = objArgs(0)
strOutputFileName=strInputFileName & ".tab"

Dim objFSO, objInputFile, objOutputFile, szScratch, szScratch1
Set objFSO = CREATEOBJECT("Scripting.FileSystemObject")
Set objInputFile = objFSO.OpenTextFile(strInputFileName,ForReading,false)
Set objOutputFile = objFSO.OpenTextFile(strOutputFileName,ForWriting,true)

Dim strLineText, objSystemSet, objSystem
While not objInputFile.AtEndOfStream

	strLineText = split(Trim(objInputFile.Readline), ",")
	if ubound(strLineText) > 0 then
		strLineText(0)=trim(strLineText(0))
		strLineText(1)=trim(strLineText(1))
		szScratch= strLineText(1) & " " & strLineText(0)
		szScratch1=TranslateADName(4, 3, szScratch)
		if len(szScratch1) > 0 then
			objOutputFile.WriteLine szScratch & vbtab & "=" & vbtab & szScratch1
		else
			szScratch= strLineText(0) & ", " & strLineText(1)
			szScratch1=TranslateADName(4, 3, szScratch)
			if len(szScratch1) > 0 then
				objOutputFile.WriteLine szScratch & vbtab & "=" & vbtab & szScratch1
			else
				objOutputFile.WriteLine szScratch & " = ..."

				if LookupADName( strLineText(0), strLineText(1) )=0 then
					objOutputFile.WriteLine vbtab & "Broadening Search..."
					if LookupADName(strLineText(0), "")=0 then
						if instr(strLineText(0), " ")>0 then
							objOutputFile.WriteLine vbtab & "Trying just the first part (" & left(strLineText(0),instr(strLineText(0), " ")-1) & ")"
							LookupADName left(strLineText(0),instr(strLineText(0), " ")-1)&"*", ""
						end if
					end if
					if LookupADName(strLineText(1), "")=0 then
						if instr(strLineText(1), " ")>0 then
							objOutputFile.WriteLine vbtab & "Trying just the first part (" & left(strLineText(1),instr(strLineText(1), " ")-1) & ")"
							LookupADName left(strLineText(1),instr(strLineText(1), " ")-1)&"*", ""
						end if
					end if
				end if
			end if
		end if
	elseif ubound(strLineText) = 0 then
		strLineText(0)=trim(strLineText(0))
		szScratch1=strLineText(0)  & vbtab & "=" & vbtab & TranslateADName(4, 3, strLineText(0) )
			if len(szScratch1) > 0 then
				objOutputFile.WriteLine szScratch & vbtab & "=" & vbtab & szScratch1
			else
				objOutputFile.WriteLine szScratch & " = ..."
				LookupADName strLineText(0), ""
			end if
	end if
Wend
Wscript.Echo "Done"

Function TranslateADName(FromType, ToType, FromName)
	Dim objTranslator
	Const ADS_NAME_INITTYPE_GC					= 3

	Const   ADS_NAME_TYPE_1779					= 1
	Const   ADS_NAME_TYPE_CANONICAL				= 2
	Const   ADS_NAME_TYPE_NT4					= 3
	Const   ADS_NAME_TYPE_DISPLAY					= 4
	Const   ADS_NAME_TYPE_DOMAIN_SIMPLE			= 5
	Const   ADS_NAME_TYPE_ENTERPRISE_SIMPLE			= 6
	Const   ADS_NAME_TYPE_GUID					= 7
	Const   ADS_NAME_TYPE_UNKNOWN				= 8
	Const   ADS_NAME_TYPE_USER_PRINCIPAL_NAME		= 9
	Const   ADS_NAME_TYPE_CANONICAL_EX				= 10
	Const   ADS_NAME_TYPE_SERVICE_PRINCIPAL_NAME		= 11
	Const   ADS_NAME_TYPE_SID_OR_SID_HISTORY_NAME	= 12 

	Set objTranslator = CreateObject("NameTranslate")

	on error resume next
	objTranslator.Init ADS_NAME_INITTYPE_GC, ""
	objTranslator.Set FromType, FromName

	TranslateADName = objTranslator.Get(ToType)
	on error goto 0
End Function

Function LookupADName(fname, lname) ' Returns number of records
	dim objConnection, objCommand, objRecordSet, searchscope, strItem
	On Error Resume Next
	Const ADS_SCOPE_SUBTREE = 2

	fname=trim(replace(fname, "'", "''"))
	lname=trim(replace(lname, "'", "''"))
	if len(fname)=0 and len(lname)=0 then exit function

	Set objConnection = CreateObject("ADODB.Connection")
	Set objCommand =   CreateObject("ADODB.Command")
	objConnection.Provider = "ADsDSOObject"
	objConnection.Open "Active Directory Provider"
	Set objCommand.ActiveConnection = objConnection

	objCommand.Properties("Page Size") = 1000
	objCommand.Properties("Searchscope") = ADS_SCOPE_SUBTREE
	if len(fname)=0 or len(lname)=0 then
		lname = fname & lname
		if lname = "*" then
			objOutputFile.WriteLine vbtab & "Skipping '*'"
			exit function
		end if
		objOutputFile.WriteLine vbtab & "{searching for '" & lname & "'}"
		objCommand.CommandText = "SELECT sAMAccountName, displayName, department, description FROM 'LDAP://" & LDAPOU & "' WHERE objectCategory='user' AND " & _
			"sn='" & lname & "'"
		searchscope="'* " & lname & "'"
	else
		objOutputFile.WriteLine vbtab & "{searching for '" & fname & "' & '" & lname & "'}"
		objCommand.CommandText = "SELECT sAMAccountName, displayName, department, description FROM 'LDAP://" & LDAPOU & "' WHERE objectCategory='user' AND " & _
			"( (givenName='" & left(lname,1) & "*' AND sn='" & fname & "') OR " & _
			"(givenName='" & left(fname,1) & "*' AND sn='" & lname & "') )"
		searchscope="'" & left(lname,1) & "* " & fname & "' or '" & left(fname,1) & "* " & lname & "'"
	end if
	Set objRecordSet = objCommand.Execute
	objRecordSet.MoveFirst
	LookupADName = objRecordSet.RecordCount
	if objRecordSet.RecordCount>=TOOMANYCHOICES then
		objOutputFile.WriteLine vbtab & "Found " & objRecordSet.RecordCount & " of " & searchscope & ". " & TOOMANYCHOICES & " is too many."
	elseif objRecordSet.RecordCount=0 then
		objOutputFile.WriteLine vbtab & "did not find " & searchscope
	else
		Do Until objRecordSet.EOF
			objOutputFile.Write vbtab & objRecordSet.Fields("displayName").Value
			if not isnull(objRecordSet.Fields("description").Value) then
				For each strItem in objRecordSet.Fields("description").Value
					if len (strItem)>0 then objOutputFile.Write  " (" & strItem & ")"
				Next
			end if
			if len(objRecordSet.Fields("department").Value)>0 then objOutputFile.Write  " (" & objRecordSet.Fields("department").Value & ")"
			objOutputFile.WriteLine  " ="& vbtab & objRecordSet.Fields("sAMAccountName").Value
			objRecordSet.MoveNext
		Loop
	end if
	on error goto 0
End Function

Tags: Uncategorized

4 responses so far ↓

  • 1 Steve Miller // Feb 24, 2009 at 5:18 pm

    Thank you for this script!

  • 2 Rod // Oct 13, 2009 at 12:41 pm

    Hi Allen,

    Thanks for providing this script.

    Once I get this working I should be able to mitigate all the different naming conventions I have found in the local ADUC.

    How do you execute this script?

    Again, many thanks…

    Rod

  • 3 Neil // Oct 13, 2009 at 1:18 pm

    Rod,
    If you run it, it will give you the run paramters, but basically you just run it as “userstocomputers.vbs input.csv” handing it a csv file of names. (Technically, it doesn’t have to be a csv: it’s written to handle columns of first-name/last-name in any order.)

    It will create an output file named “input.csv.tab”

  • 4 Neil // Oct 13, 2009 at 1:23 pm

    Or, you can just drag-and-drop a list onto the script. It should create the output file next to the input file.

Leave a Comment