Use artificial intelligence intelligently

Artificial intelligence, the use of computers to perform tasks that require intelligence in humans, has greatly improved in recent years. Old expectations about what computers can and can’t do must now be continually updated as computational capabilities quickly evolve to handle increasingly complex tasks.

Since 2022, generative AI (the use of AI techniques to generate high-quality text, image or video content) has improved dramatically, and this rapid improvement seems to be accelerating. The usefulness of generative AI continues to increase, and it is being built into more systems. It is likely that most systems will soon include at least some AI components.

Guidance for everyone

Recognize the possibility of incorrect results

Computer systems that use generative AI may produce output that looks very convincing, but the results may not be correct. Errors may not be apparent, and information presented as factual may not be accurate. Techniques that humans use to detect mistaken or deceitful content may not apply to content created by generative AI. A generative AI system can produce subtly or even completely mistaken results without exhibiting any of the signs of deceit, carelessness or inexperience that humans might show.

Results produced by generative AI systems should be carefully checked before use, even if they seem convincing. AI systems may invent results, sources, and even authors or researchers to better satisfy your query, or they may skew, bias or interpret results in ways that better fit your request or expectations.

Be alert to misinformation

Generative AI systems can be used to create fictional or fraudulent content that looks genuine. The result is easily created, highly convincing misinformation:

  • Digital forgeries can now be created by anyone and no longer require expertise.
  • Photos and videos can be created of people who do not exist, or of real people doing things they never did.
  • Written text and synthesized speech can be created to match anyone, living or dead, and can put any words in that person’s mouth.

Because AI-powered computer systems can create false content that is difficult to distinguish from the real thing, fraud is easier to commit, and humans may no longer be able to tell computer-fabricated content from human-created content. To protect yourself from misinformation:

  • Use secure trusted systems and sources.
  • Use appropriate in-person human processes to verify claims and identities and to validate transactions.
  • Beware of unverified internet-sourced information.

Recognize privacy and information security risks

Generative AI systems produce results by learning from data. Systems that use generative AI often record and store data provided to them, helping them to improve performance. If private data is shared with generative AI, this data may be revealed later in the results that the AI produces. Keep in mind:

  • Do not provide data to a generative AI system if any part of that data should not appear in the results the system produces.
  • If multiple people are using the system, one person’s data may potentially be revealed to someone else.
  • Be aware of whether the system shares data with other systems and platforms.
  • Do not share personal information or other data that resides within levels 2 to 4 of the Data Classification Standard with generative AI systems unless that sharing has been specifically authorized.
  • As AI functionality is increasingly added to Software-as-a-Service (SaaS) and cloud services, it’s important to examine potential changes to privacy policies, user agreements and contractual terms to ensure that data will be handled appropriately.
  • Copyright and contract law around AI is an evolving area, and the University’s understanding will develop as new policies, regulations and case law become settled.
  • The University has established a risk assessment process for information systems that provides guidance for appropriate uses of data for a particular system or product. Please reach out to your local IT team for more information.

Guidance for people building or training systems with AI components

Generative AI is a specialized application of an approach to AI called machine learning (ML). ML is a method of teaching computers using data sets: instead of humans writing instructions for computers to follow, each piece of training data can result in changes to the system’s internal values. A trained system can later make decisions about new data, or provide output in response to new data, based on the patterns it has learned. Because training is a vastly different way of instructing computer systems than traditional programming, it introduces new risk areas that people building ML systems need to consider.
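
As an illustration of this difference, the minimal sketch below trains a tiny classifier from labelled examples rather than from hand-written rules. The scikit-learn library, the feature values and the labels are all illustrative choices for this sketch, not a recommendation of any particular tool:

    # A minimal sketch of supervised machine learning. The two numbers in
    # each row are made-up features; the labels are the desired answers.
    from sklearn.linear_model import LogisticRegression

    X_train = [[0.1, 0.9], [0.8, 0.2], [0.2, 0.8], [0.9, 0.1]]
    y_train = ["cat", "dog", "cat", "dog"]

    # Training: each example adjusts the model's internal values (its
    # coefficients); no human writes the decision rule directly.
    model = LogisticRegression().fit(X_train, y_train)

    # The trained system can now respond to data it has never seen,
    # based on the patterns it learned from the training set.
    print(model.predict([[0.15, 0.85]]))  # expected: ['cat']

Note that nothing in the code states how to tell a cat from a dog; that behaviour exists only in values learned from the data, which is why the quality and handling of training data matter so much.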

Privacy and machine learning training data

ML systems learn from data during the training process. Training data may be revealed later in ways that can’t be predicted. For this reason, do not train ML systems with level 2-4 data unless all results from that ML system can be kept private.
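
One illustrative precaution (a sketch only, and not a substitute for the rule above) is to screen records for obvious personal identifiers before they ever reach a training pipeline. The patterns below are assumptions for this example and would catch only the most obvious identifiers:

    # A minimal sketch: drop any candidate training record that appears
    # to contain an email address or phone number. Real data
    # classification decisions require human review, not just patterns.
    import re

    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
    PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

    def looks_safe(record: str) -> bool:
        return not (EMAIL.search(record) or PHONE.search(record))

    records = [
        "The library opens at 9am.",
        "Contact Jane at jane.doe@example.edu or 555-123-4567.",
    ]
    training_set = [r for r in records if looks_safe(r)]
    print(training_set)  # only the first record survives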

Secure platforms

Those building their own applications should ensure that the platforms they use are set up with the highest level of security and safety features. Likewise, any source code used for developing such applications should be assessed to ensure that it came from a trustworthy source, to fully protect users and data. Miscreants may attempt to compromise the integrity of such applications in a variety of ways, including accessing end-user data, altering source code, or, in the case of AI applications, altering or subverting training data to introduce specific bias into the output of the application. To mitigate this latter risk, developers should regularly examine, control and/or sanitize the data used for training so that adversaries can’t subvert the behaviour of the system through careful manipulation of the training data.
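
One concrete check on source trustworthiness is to verify that downloaded code matches a checksum published by its maintainers. The file name and expected digest below are placeholders for this sketch:

    # A minimal sketch: refuse to build from source that doesn't match
    # the maintainer's published SHA-256 checksum.
    import hashlib

    def sha256_of(path):
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Placeholder: copy the real digest from the project's release page.
    EXPECTED = "0000000000000000000000000000000000000000000000000000000000000000"
    if sha256_of("vendor-library.tar.gz") != EXPECTED:
        raise SystemExit("checksum mismatch: do not build from this source")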

Beware of adversarial training data

Examine, control and/or sanitize data used for ML training to ensure that the system’s behaviour can’t be subverted through adversary-provided training data.
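
Many screening techniques exist; one simple illustration is to flag training examples whose labels disagree with a cross-validated prediction made from the rest of the data, which can surface both honest labelling mistakes and some kinds of poisoning. The data and model choice below are assumptions for this sketch:

    # A minimal sketch: flag examples whose label disagrees with what the
    # rest of the training data predicts for them. Flagged examples need
    # human review; disagreement alone doesn't prove poisoning.
    import numpy as np
    from sklearn.model_selection import cross_val_predict
    from sklearn.neighbors import KNeighborsClassifier

    X = np.array([[0.10], [0.20], [0.15], [0.90], [0.95], [0.85], [0.12]])
    y = np.array([0, 0, 0, 1, 1, 1, 1])  # the last label looks inconsistent

    preds = cross_val_predict(KNeighborsClassifier(n_neighbors=3), X, y, cv=3)
    print("examples to review:", np.where(preds != y)[0])  # expected: [6]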

Recognize the possibility of unintended bias in training data

The behaviour of an ML system is based on the training data it has been given and has learned from. The system may pick up any pattern present in that data, including unexpected or undesirable ones. Examples:

  • An ML system asked to distinguish between pictures of dogs and wolves may learn to tell them apart using background indicators, such as snow and trees, rather than the animals themselves.
  • A system asked to distinguish between people visually may make an unwarranted association between race and a trait if, in the training data, people of one race appear more often with that trait.
  • If a training data set has been collected from higher socioeconomic status individuals, it may contain biases with respect to neighbourhood, lifespan, education, and many other characteristics that correlate with socioeconomic status.

In general, if there are any biases in the training data, the system will learn those biases along with all the other patterns in the data, and may then produce prejudiced or inequitable results. For this reason, carefully examine data used for ML training for biases and address them when found.
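
As a starting point, even a very simple audit can reveal skew worth investigating. The sketch below, using made-up groups and labels, compares outcome rates across groups before any training happens:

    # A minimal sketch: compare positive-outcome rates across groups in
    # the training data. A large gap is a prompt to investigate how the
    # data was collected, not proof of bias by itself.
    from collections import Counter

    records = [("group_a", 1), ("group_a", 1), ("group_a", 0),
               ("group_b", 0), ("group_b", 0), ("group_b", 1)]

    totals, positives = Counter(), Counter()
    for group, label in records:
        totals[group] += 1
        positives[group] += label

    for group in sorted(totals):
        print(f"{group}: positive rate {positives[group] / totals[group]:.2f}")
    # group_a: positive rate 0.67
    # group_b: positive rate 0.33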