LanguagesImage Recognition in Python and SQL Server

Image Recognition in Python and SQL Server

As rich of a solution as SQL Server is, there are times when we run into significant limitations with its built in functionality. Any database application developer knows that it is easy to manipulate data in SQL Server, or any database server, by way of scripting languages or customized programming. However, what if the utility you needed to use was not natively designed to interact with a database, or modifying a program to interact with a database was “a bridge too far” in terms of development? We cover this very topic in today’s database programming tutorial.

The Scenario: Image Recognition App Example

You need to identify the contents of images that another application uploads into a SQL Server database table. The images are stored as BLOBs and, because most Image Recognition APIs should not be used in real time, you need to run a scheduled task that can extract the contents of an image and make use of an external Image Recognition API for the purposes of identifying what is in each image. The primary reason for wanting to do this offline as a separate process is because duplicate images can be filtered out, and, as most Image Recognition APIs charge by the image, it makes more sense to cache the results and use a script to only process images whose contents have not been identified.

The practice of moving complex and intensive data manipulation into the purview of an external “back end” process is common and necessary in most use cases. It would be much more difficult to do this kind of “process filtering” on the fly.

Read: Top Online Courses to Learn Python Programming

Working with SQL Server, Python, and Image Recognition API

This programming tutorial makes use of the Imagga API for basic image recognition. It is one of many image recognition APIs which can be accessed programmatically. This particular API returns tags and confidence levels in JSON format.

Let’s begin developing our software by getting some sample images loaded into an SQL Server table. Whatever application may have inserted these images is irrelevant, because the process that will be created in this article will be occurring outside of the functionality of any application which inserted these images in the first place.

Image Recognition Example

Figure 1 – Statue of Liberty near Harrisburg, PA

Python Image Recognition Example

Figure 2 – Chicago Theater

SQL Server and Python Example

Figure 3 – Colony Hotel on Ocean Drive, Miami Beach, FL

Programming Image Recognition

Figure 4 – Red Barn near Wardensville, WV

Python Image Recognition

Figure 5 – Muddy Creek Falls, Swallow Falls State Park, MD

Python Image Recognition Programming

Figure 6 – Blackwater Falls State Park, Davis, WV

Read: Text Scraping in Python

Creating a Sample Database with T-SQL

The T-SQL below was run against SQL Server Express, but it can work in any version of SQL Server. The code below was used to create the tables in a database called Image_Identity_Demo and it is run in SQL Server Management Studio (SSMS):

use Image_Identity_Demo;

create table uploaded_images
(record_id int not null identity primary key,
filename varchar(100) not null,
filedata varbinary(max) null);

create table image_tags
(record_id int not null identity primary key,
tag varchar(100) not null unique);

create table image_tag_associations
(record_id int not null identity primary key,
image_id int not null references uploaded_images(record_id) on delete cascade,
tag_id int not null references image_tags(record_id) on delete cascade,
confidence real not null default 0);
                 
Listing 1 - Creating tables to store image data

This table structure may seem somewhat convoluted and overly complex for a simple demonstration, but in this writer’s opinion, 3NF database structure for relational databases is a lost art, and a good database structure eliminates any duplicated data.

The T-SQL below will insert the data for 3 files into the uploaded_images table:

use Image_Identity_Demo;

-- Note that SQL Server often has problems accessing directories under a given user's account.
-- For this reason, sample images will be stored in a "common" directory under the root 
-- directory of the C drive.


insert into uploaded_images ([filename], filedata)
	select 'blackwater-falls-state-park.jpg', dataFromFile.*
	from openrowset(bulk 'C:\Projects\Image-Identity\blackwater-falls-state-park.jpg', single_blob) as dataFromFile;

insert into uploaded_images ([filename], filedata)
	select 'muddy-creek-falls.jpg', dataFromFile.*
	from openrowset(bulk 'C:\Projects\Image-Identity\muddy-creek-falls.jpg', single_blob) as dataFromFile;

insert into uploaded_images ([filename], filedata)
	select 'red-barn-wardensville-wv.jpg', dataFromFile.*
	from openrowset(bulk 'C:\Projects\Image-Identity\red-barn-wardensville-wv.jpg', single_blob) as dataFromFile;            

Listing 2 - Inserting binary data into the table

Selecting the data from the uploaded_images table after execution gives the following results:

T-SQL Image Recognition

Figure 7 – The sample data populating the uploaded_images table

Read: How to Create Your First Python GUI Application

Programming Image Recognition Examples

SQL Server, like most databases, is not suited to connecting to an outside API for the purposes of getting image recognition data, or many other kinds of metadata, and while SQL Server can be modified to allow for external programs to be executed from within itself, this is a very bad idea from a security and performance standpoint. A database is not intended to be an external task scheduler, and it should not be configured to try to be one.

Important note: One very bad way to use SQL Server to connect to an external data source is to use triggers on a table insert or update. If the external process fails or takes too long to execute, this can cause SQL Server to block on the row involved, or worse, the table itself. If there are multiple instantiations of such triggers, it can cause unpredictable and undesirable outcomes.

On a purely editorial note, triggers tend to cause unpredictable “weirdness” in databases. Any kind of database action that is to be the result of data being inserted, changed or deleted in a table should be coded into the application which originally initiated that action.

Python and Image Recognition

Python provides a flexible and platform-independent means to create tools which can call APIs and update the database with the information collected. Such tools can later be called by the Operating System’s preferred task scheduler. The use of a Python script also helps to address the following limitations that we may encounter if we were to use Windows Batch Files and the SQLCMD.EXE utility included with SQL Server:

  • SQLCMD.EXE is ideally suited for outputting a single file at a time.
  • As there can be more than one file that needs to be processed, multiple calls to SQLCMD.EXE would be needed.
  • Using a Windows Batch File not only limits this to Windows but presents other issues:
  • The syntax can become very complicated when looping and exception handling are involved.
  • The language is not as well-known as it used to be, so support can be difficult.
  • There are no easy tools to properly escape external data before passing it back into SQL Server.

The goals of this Python script would entail:

  • Determining which images need to be processed by the Image Recognition API.
  • Saving each one to a file.
  • Uploading that file to the Image Recognition API.
  • Downloading and securely saving the metadata generated by the Image Recognition API.
  • Deleting the file.

The following code uses Python 3.10.2 for Windows. Note that the pyodbc and requests modules had to be imported.

# C:\Users\\AppData\Local\Microsoft\WindowsApps\pip install pyodbc
import pyodbc
import sys
import os
# C:\Users\\AppData\Local\Microsoft\WindowsApps\pip install requests
import requests
import json

def main(argv):
    # You can set up a trusted connection by adding your Windows Account
    # as a user to your database.  The account then needs suitable access
    # such as db_datareader, db_datawriter, etc.
    conn = pyodbc.connect("Driver={ODBC Driver 17 for SQL Server};"
            "Server=.\SQLEXPRESS;"
            "Database=Image_Identity_Demo;"
            "Trusted_Connection=yes;")
    sql = "select a.* from uploaded_images a where (0 = (select count(*) from image_tag_associations b where b.image_id=a.record_id));"
    rs = conn.cursor()
    rs.execute(sql)
    rows = rs.fetchall()
    rs.close()
    
    if len(rows) == 0:
        print ("Either there are no files loaded, or all have tags.")
    # Normally, it is far more efficient to build a single large SQL batch and execute that, but because this script
    # is being run in the background, efficiency is not as big a concern as it may be in an interactive application.
    # This allows for us to use pyodbc's built-in SQL escaping tools to prevent SQL injection attacks as opposed to
    # having to build a subroutine which escapes all the strings in the SQL statement in order to achieve that end.
    for row in rows:
        # each column in the recordset is identified by a 0-based array index.
        fileID = int(row[0])
        filename = str(row[1])
        filedata = bytearray(row[2])
        # delete the file if it already exists.
        if os.path.exists(filename):
            os.remove(filename)
        outfile = open(filename, "w+b")
        outfile.write(filedata)
        outfile.close()
        # we now have a file to upload.
        
        url = "https://api.imagga.com/v2/tags"
        api_key = 'your_api_key'
        api_secret = 'your_api_secret'
        inFile = open(filename, 'rb')
        uploadedFile = {'image': inFile}

        myRequest = requests.post(url, files = uploadedFile, auth = (api_key, api_secret))
        #print(myRequest.text)
        outJson = myRequest.text;
        del myRequest
        inFile.close()
        os.remove(filename)
        
        tagData = json.loads(outJson);
        for item in tagData["result"]["tags"]:
            #print (item)
            confidence = float(item["confidence"])
            tag = item["tag"]["en"]
            #print ("Tag [" + tag + "] Confidence [" + str(confidence) +"]")
            values = [tag, tag, tag, fileID, confidence]
            sql1 = ("declare @recordCount int; "
                "select @recordCount = count(*) from image_tags where tag=substring(?, 1, 100); " # value[0]
                "if (0 = @recordCount) "
                "  insert into image_tags (tag) values (?);" # value[1]
                "declare @tagID int; "
                "select @tagID = record_id from image_tags where tag=? " # value[2]
                "insert into image_tag_associations (image_id, tag_id, confidence) values (?, @tagID, ?)")
            rs1 = conn.cursor()
            rs1.execute(sql1, values)
            rs1.commit()
            rs1.close()
        print ("Processed file [" + filename + "]")
    conn.close()
    return 0

# Call the "main" program.
if __name__ == "__main__":
    main(sys.argv[1:])

Listing 3 - Python Code to update image recognition information

Two additional benefits to using this particular code is that, should the Image Recognition API change, or the structure of the JSON output changes, updating the code is far more straightforward than it might be with a compiled program (or separate individual scripts) and secondly, compactness. This is all achieved in less than 100 lines of code!

Running this Python code example lists the three previously loaded files as output:

Python Programming and Images

Figure 8 – Running the code on the initially loaded images

One crucial feature of this code is that, once an image is processed, it is skipped on future runs. If an Image Recognition API charges by the query – or limits users to a certain number of queries – this feature can be helpful in mitigating costs.

Note, most commercial web-based API-delivered services have some sort of per-request charging structure, or they impose some sort of limit on the number of requests. This is true for search engines, image recognition APIs, and more. A good software solution will try to cache as many of these as possible.

Image Recognition Examples

Figure 9 – Running the same code, the existing images are skipped.

Querying the tables via SSMS confirms that the data was indeed saved:

Programming image recognition with Python

Figure 10 – Saved Image Data

Now let us go ahead and load more images:

use Image_Identity_Demo;

-- Note that SQL Server often has problems accessing directories under a given user's account.
-- For this reason, sample images will be stored in a "common" directory under the root 
-- directory of the C drive.

insert into uploaded_images ([filename], filedata)
	select 'chicago-theater.jpg', dataFromFile.*
	from openrowset(bulk 'C:\Projects\Image-Identity\chicago-theater.jpg', single_blob) as dataFromFile;

insert into uploaded_images ([filename], filedata)
	select 'colony-hotel.jpg', dataFromFile.*
	from openrowset(bulk 'C:\Projects\Image-Identity\colony-hotel.jpg', single_blob) as dataFromFile;

insert into uploaded_images ([filename], filedata)
	select 'statue-of-liberty-harrisburg.png', dataFromFile.*
	from openrowset(bulk 'C:\Projects\Image-Identity\statue-of-liberty-harrisburg.png', single_blob) as dataFromFile;           

Listing 4 - Loading the three remaining images

Running this Python code gives the following output:

Python and SQL Server

Figure 11 – Processing the three remaining images

Read: Best Python IDEs and Code Editors

Displaying Results with T-SQL

Collecting the data is not very useful if we cannot see the results of the image analysis. The following T-SQL Batch will show the top five tags associated with each image, ordered by decreasing confidence levels. This batch could have been written using an insert… output… statement, however, that would result in too many result sets being outputted, so the more “inefficient” approach of using nested cursors is instead used:

use Image_Identity_Demo;

declare @outputTable table(
filename varchar(100),
tag_value varchar(100),
confidence real);

declare @recordID int;
declare @filename varchar(100);
declare @tag varchar(100);
declare @confidence real;
declare cursor0 cursor for
  	select
		a.record_id,
		a.[filename]
	from
		uploaded_images a
	where (0 <> (select count(*) from image_tag_associations b where b.image_id=a.record_id))
open cursor0
fetch next from cursor0 into @recordID, @filename;
while (0 = @@FETCH_STATUS)
	begin
		--print cast(@recordID as varchar(10)) + ' - ' + @filename;
		declare cursor1 cursor for
			select top(5)
				b.tag, 
				c.confidence
			from
				image_tags b,
				image_tag_associations c
			where 
				c.image_id = @recordID and
				b.record_id = c.tag_id
			order by 
				c.confidence desc;
		open cursor1;
		fetch next from cursor1 into @tag, @confidence;
		while (0 = @@FETCH_STATUS)
			begin
				insert into @outputTable values (@filename, @tag, @confidence);
				fetch next from cursor1 into @tag, @confidence;
			end
		close cursor1;
		deallocate cursor1;
		fetch next from cursor0 into @recordID, @filename;
	end
close cursor0;
deallocate cursor0;

select * from @outputTable order by [filename], confidence desc;          

Listing 4 - Loading the three remaining images

And for these images, the results are quite interesting:

File Name

Tag

Confidence Level

blackwater-falls-state-park.jpg

river

61.31689

blackwater-falls-state-park.jpg

landscape

52.18626

blackwater-falls-state-park.jpg

water

50.82939

blackwater-falls-state-park.jpg

forest

49.03043

blackwater-falls-state-park.jpg

dam

46.71201

chicago-theater.jpg

cinema

100

chicago-theater.jpg

theater

100

chicago-theater.jpg

building

100

chicago-theater.jpg

structure

85.6278

chicago-theater.jpg

city

60.75557

colony-hotel.jpg

facade

78.82497

colony-hotel.jpg

building

78.073

colony-hotel.jpg

architecture

74.53635

colony-hotel.jpg

city

61.5429

colony-hotel.jpg

urban

45.45821

muddy-creek-falls.jpg

waterfall

87.5894

muddy-creek-falls.jpg

channel

85.40005

muddy-creek-falls.jpg

river

74.56114

muddy-creek-falls.jpg

body of water

70.08868

muddy-creek-falls.jpg

stream

65.62728

red-barn-wardensville-wv.jpg

barn

100

red-barn-wardensville-wv.jpg

farm building

100

red-barn-wardensville-wv.jpg

building

100

red-barn-wardensville-wv.jpg

structure

80.36546

red-barn-wardensville-wv.jpg

field

41.86442

statue-of-liberty-harrisburg.png

tower

66.32844

statue-of-liberty-harrisburg.png

structure

53.6522

statue-of-liberty-harrisburg.png

castle

51.55312

statue-of-liberty-harrisburg.png

architecture

41.88873

statue-of-liberty-harrisburg.png

sky

38.81767

Final Thoughts on SQL and Python Image Recognition

One rather interesting observation in the above output is that the Blackwater Falls was not recognized as a waterfall in the top five results. This brings us back to the issue of caching. Image Recognition is a relatively new technology, and sometimes, what a particular API may recognize as prominent may not necessarily be what a human might recognize as prominent. A future feature addition to this database application might be a means to track when an image is uploaded so that its tags can be deleted and reprocessed at a later time. It would be interesting to see how the API has changed what it finds as time passes on.

Additionally, as with any command line script, this logic can be scheduled using the Windows Task Scheduler, so that it can run at more convenient times of the day. Building on this, it would be nice to see that this “offline” approach to processing this kind of information is no longer necessary, as technology may progress to the point where image recognition can be done in real time.

Read more Python programming and software development tutorials.

Latest Posts

Related Stories