Sony received 12 granted patents across 5 categories: AI & Machine Learning (8), Cloud Gaming (1), VR & AR (1), Graphics (1), and Audio (1).
Most of the patents fall under AI & Machine Learning, covering systems that predict audience engagement for game trailers, generate game-ready avatars from selfies and body movements, customize NPC dialogue based on player behavior, and translate sign language into in-game chat messages. Additional AI & Machine Learning patents describe technology for capturing player biometrics to adapt gameplay and spectator viewing, reformatting in-game messages by context, and controlling games by voice through spatial audio processing. The remaining patents address priority-based variable encoding for cloud gaming bandwidth optimization, adaptive motion tracking for VR and AR controllers, dynamic avatar mesh morphing for graphics, and dynamic adjustment of audio mixing when dialogue occurs.
Eight AI and machine learning patents tackle different aspects of player interaction and content creation. Sony's video analysis tool examines game trailers and promotional materials to forecast audience engagement before marketing campaigns launch, allowing studios to refine their content based on patterns from past successful releases. Several patents address personalization through AI, including one that converts user selfies into avatars matching each game's visual style, and another that generates characters from body movements captured on video combined with natural language descriptions. The NPC interaction system tracks how individual players communicate with in-game characters across multiple sessions, then adjusts dialogue and behavior to match observed patterns rather than relying solely on explicit player choices. A context-aware messaging system reformats communications between players based on their current gameplay situations, rewriting messages to preserve intent while fitting each recipient's in-game circumstances. The biometric capture technology monitors player physiology during gameplay to automatically bookmark emotionally intense moments for spectators and allows NPCs to modify their behavior when detecting stress or excitement. Voice recognition technology processes commands from multiple players simultaneously by combining spatial audio positioning with individual voice profiles, eliminating the need to manually switch input devices. A gesture recognition system translates sign language into text chat, adjusting the output's tone and style based on the current game state so that messages fit the gameplay context.
Sony's single cloud gaming patent optimizes streaming bandwidth by encoding different parts of each frame at variable quality levels. The system allocates higher resolution and frame rates to important game objects while reducing quality for background elements and less critical visual information, concentrating transmission resources where they matter most during active gameplay.
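The allocation idea can be illustrated with a minimal sketch. It is not taken from the patent: the `Region` type, the priority scores, and the rule of weighting each region by priority times pixel area are all assumptions chosen to show how a fixed per-frame budget might be concentrated on important objects.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    priority: float  # hypothetical importance score, higher means more important
    area: int        # number of pixels the region covers


def allocate_bits(regions, budget_bits):
    """Split a per-frame bit budget across regions, weighted by priority * area."""
    weights = [r.priority * r.area for r in regions]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one region needs positive priority and area")
    return {r.name: int(budget_bits * w / total) for r, w in zip(regions, weights)}


# Illustrative numbers only.
regions = [
    Region("player_character", priority=4.0, area=20_000),
    Region("hud", priority=2.0, area=10_000),
    Region("background", priority=0.5, area=200_000),
]
allocation = allocate_bits(regions, budget_bits=1_000_000)
bits_per_pixel = {r.name: allocation[r.name] / r.area for r in regions}
```

With these made-up inputs the background still receives the largest share of the budget because it is the largest region, but each background pixel gets far fewer bits than a pixel of the player character, which is the effect the paragraph describes.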
The VR and AR patent addresses tracking reliability by switching between camera-based SLAM and IMU sensors for controller positioning. The system monitors tracking quality metrics and selects the most reliable method at any given moment, preventing the disruptive jumps and corrections that occur when visual tracking fails in current implementations.
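A minimal sketch of quality-based source switching follows. The single quality score, the threshold values, and the use of two thresholds (hysteresis) are assumptions made for illustration; the summary does not say which metrics or switching rules the patent actually uses.

```python
def select_tracking_source(current, slam_quality, low=0.4, high=0.7):
    """Pick 'slam' or 'imu' from a hypothetical SLAM quality score in [0, 1].

    Two thresholds are used so the source does not flip back and forth
    when the score hovers near a single cutoff.
    """
    if current == "slam" and slam_quality < low:
        return "imu"   # visual tracking degraded: fall back to inertial data
    if current == "imu" and slam_quality > high:
        return "slam"  # visual tracking recovered: return to camera-based pose
    return current


def run_selection(qualities, start="slam"):
    """Apply the selector to a sequence of quality readings."""
    source = start
    history = []
    for q in qualities:
        source = select_tracking_source(source, q)
        history.append(source)
    return history


history = run_selection([0.9, 0.5, 0.3, 0.5, 0.8])
```

In this sketch a reading of 0.5 keeps whichever source is already active, so a brief dip or partial recovery does not trigger a switch on its own.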
Sony's graphics patent modifies character meshes based on player actions, creating visible changes that reflect how the avatar has been used. The system tracks action frequency within specific time windows and progressively deforms the character model to match, making stat changes and play style visible to other players in multiplayer environments without requiring manual customization.
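One way to picture this is a sliding-window counter that drives a blend between a base mesh and a target shape. The sketch below is an assumption-laden illustration, not the patented method: `ActionTracker`, the window length, the count needed for a full morph, and the linear vertex blend are all hypothetical.

```python
from collections import deque


class ActionTracker:
    """Counts how often an action occurred within a recent time window."""

    def __init__(self, window_seconds, actions_for_full_morph):
        self.window = window_seconds
        self.full = actions_for_full_morph
        self.timestamps = deque()

    def record(self, t):
        self.timestamps.append(t)

    def morph_weight(self, now):
        # Discard actions that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        return min(1.0, len(self.timestamps) / self.full)


def morph_vertex(base, target, weight):
    """Linearly blend one vertex from its base position toward a target."""
    return tuple(b + weight * (t - b) for b, t in zip(base, target))


tracker = ActionTracker(window_seconds=60, actions_for_full_morph=10)
for t in (0, 10, 20, 30, 40):
    tracker.record(t)

weight_now = tracker.morph_weight(now=45)
weight_later = tracker.morph_weight(now=65)
vertex = morph_vertex((0.0, 0.0, 0.0), (2.0, 4.0, 6.0), weight_now)
```

Because old actions age out of the window, the weight falls again if the player stops performing the action, so the deformation tracks recent play style instead of a lifetime total.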
The audio patent uses AI to separate music vocals from background instrumentation when in-game dialogue or voice chat occurs. The system detects potential conflicts between vocal tracks in background music and spoken communications, then isolates and adjusts specific frequency ranges in real time to maintain clarity without requiring pre-processed audio assets or manual mixing adjustments.
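The mixing step alone can be sketched as follows. This example assumes a separation model has already split the music into vocal and instrumental stems and that a speech detector supplies a per-sample flag; both are stand-ins, and the gain value and the `duck_vocals` function are hypothetical. The separation itself, which is the AI part, is not shown.

```python
def duck_vocals(instrumental, vocals, speech_active, duck_gain=0.25):
    """Remix music, attenuating only the vocal stem while speech is active.

    instrumental, vocals: equal-length sequences of samples (assumed to be
        stems already produced by a separation model).
    speech_active: equal-length sequence of booleans from a speech detector.
    """
    if not (len(instrumental) == len(vocals) == len(speech_active)):
        raise ValueError("all inputs must have the same length")
    return [
        inst + (duck_gain if speaking else 1.0) * voc
        for inst, voc, speaking in zip(instrumental, vocals, speech_active)
    ]


mixed = duck_vocals(
    instrumental=[0.1, 0.1, 0.1],
    vocals=[0.4, 0.4, 0.4],
    speech_active=[False, True, False],
)
```

Only the vocal stem is reduced, so the instrumental bed keeps its level while dialogue or voice chat is playing. A production mixer would also smooth the gain change over time to avoid audible clicks.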