* Sync keyboard text buffer from Flash before processing physical input

The native keyboard scene maintained a separate C++ buffer (m_win64TextBuffer) for physical keyboard input, which was pushed to the Flash text field via setLabel(). However, when the user typed with the on-screen controller buttons, Flash updated its text field directly through ActionScript without updating the C++ buffer. This caused a desync: switching back to the physical keyboard would overwrite any text entered via controller, since m_win64TextBuffer still held the old value from before the controller edits.

Fix: read the current Flash text field into m_win64TextBuffer at the start of each tick(), before consuming new physical keyboard chars. This ensures both input methods always operate on the same state.

* Use last active input device to decide keyboard mode instead of connection state

The keyboard UI mode (on-screen virtual keyboard vs direct text input) was determined by Win64_IsControllerConnected(), which checks if any XInput controller is physically plugged in. This meant that even if the player was actively using mouse and keyboard, the virtual keyboard would still appear as long as a controller was connected.

Replace the connection check with g_KBMInput.IsKBMActive(), which tracks the actual last-used input device based on per-frame input detection. Now the keyboard mode is determined by what the player is currently using, not what hardware happens to be plugged in.

Affected scenes: CreateWorldMenu (world naming) and LoadOrJoinMenu (world renaming).

* Fix TextInput caret behavior and add proper cursor editing for KBM direct edit

The direct text editing mode introduced for KBM users had several issues with the TextInput control's caret (blinking cursor) and text manipulation:

1. Caret visible when not editing: When navigating to the world name field with keyboard/mouse, Flash's Iggy focus system would show the blinking caret even though the field wasn't active for editing yet (Enter not pressed). This was misleading since typing had no effect in that state.

   Fix: access the FJ_TextInput's internal m_mcCaret MovieClip and force its visibility based on editing state. This is enforced every tick because setLabel() and Flash focus transitions continuously reset the caret state.

2. No cursor movement during editing: The direct edit implementation treated the text as a simple buffer with push_back/pop_back, so there was no concept of cursor position. Backspace only deleted from the end, and arrow keys did nothing.

   Fix: track cursor position (m_iCursorPos) in C++ and use wstring insert/erase at that position. Arrow keys (Left/Right), Home, End, and Delete now work as expected. The visual caret position is synced to Flash via the FJ_TextInput's SetCaretIndex method.

3. setLabel() resetting caret position: Every call to setLabel() (when text changes) caused Flash to reset the caret to the end of the string, making the cursor jump visually even though the C++ position was correct.

   Fix: enforce caret position via setCaretIndex every tick during editing, so any Flash-side resets are immediately corrected.

New UIControl_TextInput API:
- setCaretVisible(bool): toggles m_mcCaret.visible in Flash
- setCaretIndex(int): calls FJ_TextInput.SetCaretIndex in Flash

* Fix keyboard/arrow navigation not working when no UI element is focused

On Windows64 with KBM, moving the mouse over empty space (outside any button) would clear the Iggy focus entirely. After that, pressing arrow keys did nothing because Flash had no starting element to navigate from.

Two changes here:
- Don't set focus to IGGY_FOCUS_NULL when the mouse hovers over empty space. The previous hover target stays focused, so switching back to arrows keeps working seamlessly.
- When a navigation key is pressed and nothing is focused at all (e.g. mouse was already on empty space when the menu opened), grab the first focusable element instead of silently dropping the input. The keypress is consumed to avoid jumping two elements at once.

This makes mixed mouse+keyboard navigation feel a lot more natural. You can point at a button, then continue with arrows, or just start pressing arrows right away without having to hover first.

* Overhaul mouse support and generalize direct text editing to all UI scenes

This is a large rework of the Windows64 KBM (keyboard+mouse) input layer. It touches the mouse hover system, the mouse click dispatch, and the direct text editing infrastructure, then applies all of it to every scene that has text input fields or non-standard clickable elements.

MOUSE HOVER REWRITE (UIController.cpp tickInput)

The old hover code had two structural problems:

(a) Scene lookup was group-first: it iterated UI groups and checked all layers within each group. The Tooltips layer on eUIGroup_Fullscreen (which holds non-interactive overlays like button hints) would be found before in-game menus on eUIGroup_Player1. The tooltip scene's focusable objects captured mouse input and prevented hover from reaching the actual menu. Fixed by switching to layer-first lookup across all groups, and skipping eUILayer_Tooltips entirely since those are never interactive.

(b) On tabbed menus (LaunchMoreOptionsMenu Game vs World tabs), all controls from all tabs are registered in Flash at the same time. There was no filtering, so controls from inactive tabs had phantom hitboxes that overlapped the active tab's controls, making certain buttons unhoverable. Fixed by introducing parent panel tracking: each UIControl now has an m_pParentPanel pointer, set automatically by the UI_MAP_ELEMENT macro during mapElementsAndNames(). The hover code checks the control's parent panel against the scene's GetMainPanel() and skips mismatches.
This is the same technique the Vita touch code used, but applied to mouse hover.

The coordinate conversion was also simplified. The old code had two separate scaling paths (window dimensions for hover, display dimensions for sliders). Now there is one conversion from window pixel coords to SWF coords using the scene's own render dimensions.

REUSING VITA TOUCH APIs FOR MOUSE (ButtonList, UIScene)

Several APIs originally gated behind __PSVITA__ are now enabled for Win64:

- UIControl_ButtonList::SetTouchFocus(x,y) and CanTouchTrigger(x,y): the Flash-side ActionScript methods were already registered on all platforms in setupControl(), only the C++ wrappers were ifdef-gated. Opening the ifdefs to include _WINDOWS64 lets the mouse hover code delegate to Flash for list item highlighting, which handles internal scrolling and item layout that would be impractical to replicate in C++.
- UIScene::SetFocusToElement(id): programmatic focus-by-control-ID, used as a fallback when Iggy focusable objects do not match the C++ hit test.
- UIScene_LaunchMoreOptionsMenu::GetMainPanel(): returns the active tab panel control, needed by the hover code to filter inactive tab controls.

MOUSE CLICK DISPATCH (UIScene.cpp handleMouseClick)

Left-clicking previously relied entirely on Iggy ACTION_MENU_OK dispatch, which routes to whatever Flash considers focused. This broke for custom-drawn elements that are not Flash buttons (crafting recipe slots), and for scenes where Iggy focus did not match what the user visually clicked.

Added a virtual handleMouseClick(x, y) on UIScene with a default implementation that hit-tests C++ controls. When multiple controls report overlapping bounds (common in debug scenes where TextInputs report full Flash-width), it picks the one whose left edge X is closest to the click. Returns true to consume the click and suppress the normal ACTION_MENU_A dispatch via an m_mouseClickConsumedByScene flag on UIController.
The default implementation handles buttons, text inputs, and checkboxes (toggling state and calling handleCheckboxToggled directly).

CRAFTING MENU MOUSE CLICK (UIScene_CraftingMenu.cpp)

The crafting menu recipe slots (H slots) are rendered through the Iggy custom draw callback, not as Flash buttons. They have no focusable objects, so mouse clicking did nothing.

The solution caches SWF-space positions during rendering: inside customDraw, when H slot 0 and H slot 1 are drawn, the code extracts SWF coordinates from the D3D11 transform matrix via gdraw_D3D11_CalculateCustomDraw_4J. The X difference between slot 0 and slot 1 gives the uniform slot spacing.

handleMouseClick then uses these cached bounds to determine which recipe slot was clicked, resets the vertical slot indices (same pattern as the constructor), updates the highlight and vertical slots display, and re-shows the old slot icon. This mirrors the existing controller LEFT/RIGHT navigation in the base class handleKeyDown.

DIRECT EDIT REFACTORING (UIControl_TextInput)

The direct text editing feature (type directly into text fields instead of opening the virtual keyboard) was originally implemented inline in CreateWorldMenu with all the state, character consumption, cursor tracking, caret visibility, and cooldown logic hardcoded in one scene.
Moved everything into UIControl_TextInput:

- beginDirectEdit(charLimit): captures current label, inits cursor at end
- tickDirectEdit(): consumes chars, handles Backspace/Enter/Escape, cursor keys (Left/Right/Home/End/Delete), enforces caret visibility every tick (because setLabel and Flash focus transitions continuously reset it), returns Confirmed/Cancelled/Continue
- cancelDirectEdit() / confirmDirectEdit(): programmatic control
- isDirectEditing() / getDirectEditCooldown() / getEditBuffer(): state query

For SWFs that lack the m_mcCaret MovieClip child (like AnvilMenu), the existence check validates by reading a property from the resolved path, since IggyValuePathMakeNameRef always succeeds even for undefined refs. When no caret exists, the control inserts a _ character at the cursor position as a visual fallback. The caret check result is cached in m_bHasCaret/m_bCaretChecked to avoid repeated Iggy calls that could corrupt internal state.

SCENES UPDATED WITH DIRECT EDIT + VIRTUAL KEYBOARD

Every scene with text input now supports both input modes: direct editing when KBM is active, virtual keyboard (via NavigateToScene eUIScene_Keyboard) when using a controller. The mode is chosen at press time based on g_KBMInput.IsKBMActive().

- CreateWorldMenu: refactored to use the new UIControl_TextInput API, removing ~80 lines of inline editing code.
- AnvilMenu: item renaming now supports direct edit. The keyboard callback uses Win64_GetKeyboardText instead of InputManager.GetText (which reads from a different buffer on Win64). The virtual keyboard is opened with eUILayer_Fullscreen + eUIGroup_Fullscreen so it does not hide the anvil container menu underneath. Added null guards on getMovie() in setCostLabel and showCross since the AnvilMenu SWF may not fully load on Win64.
- SignEntryMenu: all 4 sign lines support direct edit. Clicking a different line while editing confirms the current one.
  Each line's cooldown timer is checked independently to prevent Enter from re-opening the edit.
- LaunchMoreOptionsMenu: seed field direct edit with proper input blocking.
- DebugCreateSchematic: all 7 text inputs (name + start/end XYZ coords). handleMouseClick is overridden to always consume clicks during edit to prevent Iggy re-entry on empty space.
- DebugSetCamera: all 5 inputs (camera XYZ + Y rotation + elevation). Clicking a different field while editing confirms the current value and opens the new one. Float display formatting changed from %f to %.2f.

All keyboard completion callbacks on Win64 now use Win64_GetKeyboardText (two params: buffer + size), which reads from the correct g_Win64KeyboardResult global when using the in-game keyboard scene, instead of InputManager.GetText.

SCROLL WHEEL

Mouse wheel events (ACTION_MENU_OTHER_STICK_UP/DOWN) are now centrally remapped to ACTION_MENU_UP/DOWN in UIController::handleKeyPress when KBM is active. Previously each scene would need to handle OTHER_STICK actions separately, and most did not, so scroll wheel only worked in a few places.

* Add mouse click support to CraftingMenu (tab switching, slot selection, craft)

The crafting screen's horizontal recipe slots and category tabs are custom-drawn via Iggy callbacks rather than regular Flash buttons, so the standard mouse hover system can't interact with them. This adds handleMouseClick to derive clickable regions from the H slot positions cached during customDraw.

Tab clicking: tab hitboxes are computed relative to the H slot row since the Vita TouchPanel overlays (full-screen invisible rectangles) aren't suitable for direct hit-testing on Win64. The Y bounds were tuned empirically to match the SWF tab icon positions. Clicking a tab runs the same switch logic as LB/RB: hide old highlight, update group index, reset slot indices, recalculate recipes, and refresh the display.

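As a rough illustration of deriving H slot hit regions from positions cached during customDraw, here is a minimal sketch. The HSlotCache struct, its field names, and HitTestHSlot are hypothetical; the only details taken from the text are that slot 0 and slot 1 positions are cached and that their X difference gives the uniform spacing.

```cpp
#include <cmath>

// Hypothetical cache filled while the H slots are drawn; names are illustrative.
struct HSlotCache {
    float slot0X;  // left edge of H slot 0 in SWF space
    float slot1X;  // left edge of H slot 1 in SWF space
    float top;     // top edge shared by the whole slot row
    float width;   // width of one slot
    float height;  // height of one slot
    int   count;   // number of H slots in the row
};

// Returns the index of the slot under (x, y), or -1 when the point misses.
inline int HitTestHSlot(const HSlotCache& c, float x, float y) {
    const float spacing = c.slot1X - c.slot0X;       // uniform slot pitch
    if (spacing <= 0.0f || c.count <= 0) return -1;  // cache not populated yet
    if (y < c.top || y >= c.top + c.height) return -1;
    if (x < c.slot0X) return -1;

    const int index = static_cast<int>(std::floor((x - c.slot0X) / spacing));
    if (index >= c.count) return -1;

    // Reject points that land in the gap between two neighbouring slots.
    const float left = c.slot0X + static_cast<float>(index) * spacing;
    if (x >= left + c.width) return -1;
    return index;
}
```

A click handler would call this with the SWF-space click position and treat -1 as a miss.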
H slot clicking: clicking a different recipe slot selects it (updating V slots, highlight, and re-showing the previous slot). Clicking the already-selected slot crafts the item by dispatching ACTION_MENU_A through handleKeyDown, reusing the existing crafting path. Empty slots (iCount == 0) are ignored.

All mouse clicks on the scene are consumed (return true) to prevent misses from falling through as ACTION_MENU_A and accidentally triggering a craft. This only suppresses mouse-originated A presses via m_mouseClickConsumedByScene; keyboard and controller A remain fully functional.

Also enables GetMainPanel for Win64 (was Vita-only) so the mouse hover system can filter controls by active panel, same as other tabbed menus.

* Fix mouse hover selecting wrong buttons from the third onward

The hover code was doing a redundant second hit-test against Iggy focusable object bounds after the C++ control bounds had already identified the correct control. Iggy focusable bounds are wider than the actual visible buttons and overlap vertically, so the "pick largest x0" heuristic would match focusables belonging to earlier buttons when hovering the right side of buttons 3+.

Replaced the IggyPlayerGetFocusableObjects path with a direct SetFocusToElement call using the already-correct hitControlId from the C++ hit-test, same approach the click path uses in handleMouseClick.

Also switched the overlap tiebreaker from "largest x0" to smallest area, consistent with how clicks resolve overlapping controls. TextInput is excluded from hover focus to avoid showing the caret on mere mouse-over (its Iggy focus is set on click).

* Use smallest-area tiebreaker for mouse click hit-testing too

Same overlap fix applied to handleMouseClick: when multiple controls contain the click point, prefer the one with the smallest bounding area instead of the one with the largest left-edge X. This is more robust for any layout (vertical menus, grids, overlapping panels) and matches the hover path logic.
Those changes were originally made to fix the teleport UI for the mouse, but they broke every other UI that had been working well.

* Fix mouse cursor staying trapped in window on alt-tab

When the inventory or other UI with a hidden cursor was open, alt-tabbing out would leave the cursor locked to the game window. SetWindowFocused(false) from WM_KILLFOCUS correctly released the clip and showed the cursor, but Tick() was unconditionally calling SetCursorPos every frame to re-center it, overriding the release. Added m_windowFocused to the Tick() condition so cursor manipulation only happens while the window actually has focus.

* Map mouse right click to ACTION_MENU_X for inventory half-stack

Right clicking an item stack in Java Edition picks up half of it. Console Edition already handles this via ACTION_MENU_X (the X button on controller), which sets buttonNum=1 in handleKeyDown. This maps mouse right click to that same action so KBM players get the same behavior across all container menus (inventory, chests, furnaces, hoppers, etc).

* Fix mouse hover hitting removed controls (ghost hitboxes)

When removeControl() removes a Flash element (e.g. the Reinstall button in Help & Options, or the Debug button when disabled), the C++ control object stays in the m_controls vector. On Vita this was handled by calling setHidden(true) and checking getHidden() in the touch hit-test, but on Windows64 none of that was happening.

The result: removed buttons kept phantom bounds that the hover code would match against, stealing focus from the buttons that shifted into their visual position. In the Help & Options menu with debug enabled, the removed Reinstall button (Button6) had ghost bounds overlapping where the Debug button (Button7) moved to after the removal, making Debug un-hoverable and snapping focus to Button1.

The fix has three parts:
- removeControl() now calls setHidden(true) on all platforms, not just Vita.
  The m_bHidden member was already declared on all platforms, only the accessors were ifdef'd behind __PSVITA__.
- Removed the __PSVITA__ ifdef from setHidden/getHidden in UIControl.h so they're available everywhere.
- Added getHidden() checks in both the hover and click hit-test loops, matching what the Vita touch code already does. The check is a simple bool read (no Flash/Iggy call), placed before the getVisible() query which hits Flash and can return stale values for removed elements.

* Add right-click to open save options in world selection menu

On controller, RB (ACTION_MENU_RIGHT_SCROLL) opens the save options dialog (rename/delete) when a save is selected. Mouse right-click maps to ACTION_MENU_X, which had no Windows64 handler in this scene.

Added save options handling under ACTION_MENU_X for _WINDOWS64 so right-clicking a save opens the same dialog. Also handles the mashup world hide action for right-click consistency. Console-only options (copy save, save transfer) are excluded since they don't apply here.

* Fix splitscreen mouse, keyboard cursor, and local player join

Mouse hover and click in split-screen was broken: the coordinate conversion from window pixels to Flash/SWF space did not account for the viewport tile-origin offset or the smaller display dimensions of each splitscreen quadrant. Now the mouse position is mapped through three steps: window pixels to UIController screen space, subtract the viewport origin (which varies per quadrant/split type), then scale from display size to SWF authoring size. This fixes hover highlighting and click targeting in all splitscreen layouts.

Mouse input was also bleeding into other splitscreen players' UI groups because the scene lookup iterated all groups. Now it only checks the fullscreen group and the primary (KBM) player's group, so controller players' menus are never affected by mouse movement.

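The three-step mouse mapping described above (window pixels to screen space, subtract the viewport origin, scale display size to SWF size) can be sketched as follows. The MouseMapping struct, its fields, and WindowToSwf are hypothetical names chosen for illustration; the real code presumably reads these values from UIController and the scene.

```cpp
struct Vec2 { float x, y; };

// Hypothetical bundle of the sizes involved in the conversion.
struct MouseMapping {
    float windowW, windowH;      // client area of the window, in pixels
    float screenW, screenH;      // UI screen space
    float viewportX, viewportY;  // origin of this player's viewport in screen space
    float displayW, displayH;    // size of this player's viewport in screen space
    float swfW, swfH;            // authoring size of the scene's SWF
};

inline Vec2 WindowToSwf(const MouseMapping& m, Vec2 windowPx) {
    // Step 1: window pixels -> UI screen space.
    const float sx = windowPx.x * m.screenW / m.windowW;
    const float sy = windowPx.y * m.screenH / m.windowH;

    // Step 2: make the point relative to the viewport tile origin.
    const float lx = sx - m.viewportX;
    const float ly = sy - m.viewportY;

    // Step 3: scale from the viewport's display size to SWF authoring size.
    return Vec2{ lx * m.swfW / m.displayW, ly * m.swfH / m.displayH };
}
```

For a bottom-right quadrant, the viewport origin is the centre of the screen, so a point in the middle of that quadrant maps to the middle of the SWF.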
Mouse grab/release (cursor lock for gameplay) was triggering for every local player's tick, causing fights between splitscreen players over the cursor state. Now only the primary pad player controls grab state.

The in-game keyboard scene in PC mode had no cursor movement: typing always appended at the end and backspace always deleted from the end. Added a cursor position tracker (m_iCursorPos) so that characters are inserted at the cursor, backspace deletes behind it, and arrow keys, Home, End, and Delete all work as expected. The Flash caret is synced to the cursor position each tick. Also stopped syncing the text buffer back from Flash in PC mode, which was resetting the cursor every tick. Arrow keys in PC mode no longer get forwarded to Flash (which would move the on-screen keyboard selector instead of the text cursor).

AddLocalPlayerByUserIndex was calling NotifyPlayerJoined before the IQNet slot was actually registered, passing a pointer obtained via GetLocalPlayerByUserIndex which checks customData (not set yet at that point). Now AddLocalPlayerByUserIndex is called first, and if it succeeds, the notification uses the static m_player array directly. The stub AddLocalPlayerByUserIndex now properly initialises the slot with gamertag and remote/host flags instead of being a no-op.

IsSignedIn was hardcoded to return true only for pad 0, preventing splitscreen players from joining. Now it checks IsPadConnected so any connected controller can sign in.

GetXUID returned INVALID_XUID for all pads except 0, which broke splitscreen player identity. Now each pad gets a unique XUID derived from the base value plus the pad index.

Pinned internal resolution to 1920x1080 and removed GetSystemMetrics auto-detection which was picking up the native monitor resolution and breaking the 16:9 assumption in the viewport math and Flash layout. DPI awareness is kept for consistent pixel coordinates.

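For the cursor tracking added to the in-game keyboard scene in this commit, the insert/erase logic could look roughly like the sketch below. EditBuffer and its method names are hypothetical; only the idea of a cursor index (m_iCursorPos in the text) driving wstring insert/erase comes from the commit message.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the scene's edit state.
struct EditBuffer {
    std::wstring text;
    std::size_t  cursor = 0;  // plays the role of m_iCursorPos

    // Typed characters go in at the cursor, which then moves past them.
    void insertChar(wchar_t ch) { text.insert(cursor, 1, ch); ++cursor; }

    // Backspace removes the character behind the cursor.
    void backspace() {
        if (cursor > 0) { text.erase(cursor - 1, 1); --cursor; }
    }

    // Delete removes the character under the cursor; the cursor stays put.
    void del() {
        if (cursor < text.size()) text.erase(cursor, 1);
    }

    void left()  { if (cursor > 0) --cursor; }
    void right() { if (cursor < text.size()) ++cursor; }
    void home()  { cursor = 0; }
    void end()   { cursor = text.size(); }
};
```

After each change the caller would push `text` to Flash and re-apply the caret index, since the text notes that Flash otherwise resets the caret.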
* Fix Escape key not opening pause menu during tutorial hints

The KBM pause check had an IsTutorialVisible guard that blocked Escape entirely while any tutorial popup was on screen. The controller path never had this restriction. Removed the check so Escape behaves the same as Start on controller.

* Fix crash in WriteHeader when save buffer is too small for header table

When a player enters a new region, RegionFile's constructor calls createFile which adds a FileEntry with length 0 to the file table. This increases the header table size (appended at the end of the save buffer) by sizeof(FileEntrySaveData) per entry, but since no actual data is written to the file, MoveDataBeyond is never called and the committed virtual memory pages are never grown to match.

On the next autosave tick, saveLevelData writes level.dat first (before chunkSource->save which would have grown the buffer). If level.dat doesn't need to grow, finalizeWrite calls WriteHeader which tries to memcpy the now-larger header table past the end of committed memory, causing an access violation.

This is especially likely in splitscreen where two players exploring at the same time can create multiple new RegionFile entries within a single tick, quickly exhausting the page-alignment slack in the buffer (yes, I am working on splitscreen in the meantime :) ).

The fix was deduced by tracing the crash callstack through the save system: FileHeader, ConsoleSaveFileOriginal, the stream chain, and the RegionFile/RegionFileCache layer. The root cause turned out to be a gap between createFile (which grows the header table) and MoveDataBeyond (the only place that grows the buffer), with finalizeWrite sitting right in between unprotected.

The buffer growth check added here mirrors the exact same VirtualAlloc pattern already used in MoveDataBeyond (lines 484-497) and in the constructor's decompression path (lines 176-190), so it integrates naturally with the existing code.
Same types, same page rounding, same error handling. The fast path (no new entries, buffer already big enough) is a single DWORD comparison whose branch is not taken, so the overhead in the common case is negligible.

This is the right place for the fix because finalizeWrite is the sole caller of WriteHeader, meaning every code path that writes the header (closeHandle, PrepareForWrite, deleteFile, Flush) is now protected by a single check point.

* Fix TextInput bugs and refactor direct edit handling into UIScene base class

The fake cursor character (_) used for SWFs without m_mcCaret was leaking into saved sign and anvil text. This happened because setLabel() with instant=false only updates the C++-side cache, deferring the Flash write to the next control tick. Any getLabel() call before that tick reads the old Flash value still containing the underscore. Fixed by passing instant=true in confirmDirectEdit, cancelDirectEdit, and the Enter key path inside tickDirectEdit, so the cleaned text hits Flash immediately.

Mouse hover over TextInput controls (world name, anvil name, seed field) was not showing the yellow highlight border. The hover code used IggyPlayerSetFocusRS which sets Iggy's internal dispatch focus but does not trigger Flash's ChangeState callback, so no visual feedback appeared. Buttons worked fine because Iggy draws its own focus ring on them, but TextInput relies entirely on ChangeState(0) for the yellow border. Switched to SetFocusToElement which goes through the Flash-side SetFocus path, then immediately call setCaretVisible(false) to suppress the blinking caret that comes with focus. No visual flicker since rendering happens after both tickInput and scene tick complete.

While direct editing, mouse hover was able to move focus away to other TextInputs on the same scene (most noticeably on the sign editor, where hovering a different line would steal focus from the line being typed).
Added an isDirectEditBlocking() check in the hover path to skip focus changes when any input on the scene is actively being edited.

The Done button in SignEntryMenu was unresponsive to mouse clicks during direct editing. The root cause is execution order: handleMouseClick runs before handleInput in the frame. The base handleMouseClick found the Done button and called handlePress, but handlePress bailed out because of the isDirectEditing guard. The click was marked consumed, so handleInput never saw it. Fixed by overriding handleMouseClick in SignEntryMenu to detect the Done button hit while editing and confirm + close directly.

Added click-outside-to-deselect for anvil and world name text inputs. Both scenes previously required Enter to confirm the edit, which felt wrong. Now clicking anywhere outside the text field bounds confirms the current text, matching standard UI behavior.

The anvil menu now updates the item name in real time while typing, like Java edition. Previously the name was only applied on Enter, so the repair cost display was stale until confirmation.

The biggest change is structural: every scene that used direct editing (AnvilMenu, CreateWorldMenu, SignEntryMenu, LaunchMoreOptionsMenu, DebugCreateSchematic, DebugSetCamera) had its own copy of the same boilerplate: tickDirectEdit loops in tick(), click-outside hit testing in handleMouseClick(), cooldown guard checks in handleInput/handlePress, and result dispatch with switch/if chains. This was around 200 lines of near-identical code scattered across 6 files, each with its own slight variations and its own bugs waiting to happen.

Pulled all of it into UIScene with two virtual methods: getDirectEditInputs() where scenes register their text inputs, and onDirectEditFinished() where they handle confirmed/cancelled results.
The base class tick() drives tickDirectEdit on all registered inputs, handleMouseClick() does the click-outside-to-deselect hit test generically using panel offsets, and isDirectEditBlocking() replaces all the inline cooldown checks. Scenes now just override those two methods and get everything for free.

Also removed the m_activeDirectEditControl enum tracking from the debug scenes (DebugCreateSchematic, DebugSetCamera) since the base class handles lifecycle tracking through the controls themselves.

* Remap scroll wheel to LEFT/RIGHT for horizontal controls

The scroll wheel was always remapped to UP/DOWN, which is fine for vertical lists but useless on horizontal controls like sliders and the texture pack selector.

Track whether the mouse is hovering a horizontal control during the hover hit-test (new bool m_bMouseHoverHorizontalList, set for eTexturePackList and eSlider). When the flag is set, handleKeyPress emits LEFT/RIGHT instead of UP/DOWN for wheel events.

TexturePackList is also now part of the mouse hover system with proper hit-testing, relative-coord SetTouchFocus and GetRealHeight for accurate bounds.

* Guard setCaretVisible and setCaretIndex against null movie

tickDirectEdit called into Iggy every tick without checking if the movie was still valid, which crashed inside iggy_w64.dll when the Flash movie got unloaded or wasn't ready yet.

* Fix creative inventory scroll for both mouse wheel and controller

The mouse scroll wheel was not working in the creative inventory at all. UIController remaps wheel input from OTHER_STICK to UP/DOWN for KBM users, but the base container menu handler consumed UP/DOWN for grid navigation before it could reach the creative menu's page scrolling logic in handleAdditionalKeyPress. Fixed by detecting scroll wheel input on UP/DOWN in the base handler and forwarding it as OTHER_STICK to handleAdditionalKeyPress instead.
Also fixed the controller right stick scrolling way too fast: it was jumping TabSpec::rows (5) rows per tick at 100ms repeat rate, which blew through the entire item list almost instantly. Reduced to 1 row per tick so scrolling feels controlled on both input methods.

* Fix split-screen world rendering aspect ratio

gluPerspective was hardcoded to use g_iAspectRatio (always 16:9) instead of the aspect parameter from getFovAndAspect, which adjusts for split-screen viewports. The 3D world was horizontally stretched in top/bottom split because the projection used 16:9 while the viewport was 32:9.

* Split-screen UI system with full ultrawide and multi-aspect-ratio support

Screen resolution is now auto-detected from the monitor at startup instead of being hardcoded to 1920x1080. This fixes rendering on ultrawide (21:9), super-ultrawide (32:9), 16:10, and any other aspect ratio, both in singleplayer and split-screen multiplayer.

The 3D world renders at native resolution so the full monitor is used. Flash UI is 16:9-fitted and centered inside each viewport, pillarboxed on wide displays and letterboxed on tall ones.

Logical game dimensions (used for ortho projection and HUD layout) are computed proportionally from the real screen aspect ratio, fixing the stretched world projection and HUD that the old hardcoded 1280x720 caused on non-16:9 monitors.

GameRenderer::ComputeViewportForPlayer uses the actual backbuffer size instead of the logical game size, which was causing split-screen viewports to be sized incorrectly.

UIScene::render fits menus to 16:9 within each split viewport using GetViewportRect + Fit16x9, keeping inventory/crafting/options screens at their designed aspect ratio instead of stretching.

Panorama and MenuBackground render at full viewport size with proper tile scaling so the background fills the entire area without gaps in vertical split and quadrant layouts.

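The 16:9 fitting behaviour described above (pillarbox on wide viewports, letterbox on tall ones, centered in both cases) can be sketched like this. The Rect type and the function name Fit16x9Sketch are hypothetical; the real Fit16x9 helper may use a different signature and floating-point math.

```cpp
struct Rect { int x, y, w, h; };

// Returns the largest 16:9 rectangle that fits inside `viewport`, centered.
inline Rect Fit16x9Sketch(const Rect& viewport) {
    Rect out = viewport;
    // Compare the viewport aspect with 16:9 by cross-multiplying,
    // which avoids floating-point rounding.
    if (viewport.w * 9 > viewport.h * 16) {
        // Wider than 16:9: keep the height, shrink the width (pillarbox).
        out.w = viewport.h * 16 / 9;
        out.x = viewport.x + (viewport.w - out.w) / 2;
    } else {
        // 16:9 or taller: keep the width, shrink the height (letterbox).
        out.h = viewport.w * 9 / 16;
        out.y = viewport.y + (viewport.h - out.h) / 2;
    }
    return out;
}
```

For example, a 3440x1440 ultrawide viewport yields a 2560x1440 area offset 440 pixels from the left, and the bottom half of a 1920x1080 top/bottom split yields a 960x540 area centered horizontally.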
HUD tile rendering uses ComputeTileScale to uniformly scale the SWF and show the bottom portion (hotbar, hearts, hunger) in horizontal and quadrant splits. repositionHud passes visible SWF-space dimensions to ActionScript for proper element centering within each viewport.

Chat and Tooltips overlays use ComputeTileScale and ComputeSplitContentOffset to anchor correctly to the bottom of each player's viewport tile.

Container menus apply Fit16x9 to pointer coordinate mapping so the cursor tracks correctly in split-screen. getMouseToSWFScale moved out of the header into the .cpp. Mouse input in onMouseTick is gated to pad 0 since raw mouse deltas should only drive player 1.

All shared viewport math lives in UISplitScreenHelpers.h:
- GetViewportRect: origin and dimensions for any viewport type
- Fit16x9: aspect-correct fitting with centering offsets
- ComputeTileScale: uniform scale and Y-offset for tile rendering
- ComputeSplitContentOffset: content centering for overlay components

* Fix XUID assignment for split-screen local players

Main's XUID refactor returned INVALID_XUID for pad != 0, which breaks split-screen because each local player needs a distinct identity for the save system and per-player inventory data. Now pads 1-3 get unique XUIDs derived from the legacy embedded base (base + iPad), same as the original console behavior. Only pad 0 uses the persistent uid.dat-backed XUID for networking.

* Use persistent XUID for all pads in GetXUID

All pads now get unique XUIDs derived from the persistent uid.dat value (base + iPad offset). This gives each split-screen player a globally unique identity that works for both local play and online multiplayer. The host legacy XUID override for save compatibility still happens in Minecraft.cpp after GetXUID is called, so old worlds are unaffected.

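The "base + iPad offset" derivation from the XUID commits above amounts to very little code. In this sketch the Xuid alias, kInvalidXuid, kMaxLocalPads, and XuidForPad are all hypothetical names, and the 64-bit width and zero sentinel are assumptions rather than facts from the commit.

```cpp
#include <cstdint>

using Xuid = std::uint64_t;        // assumption: XUIDs are 64-bit values
constexpr Xuid kInvalidXuid = 0;   // placeholder sentinel for this sketch only
constexpr int  kMaxLocalPads = 4;  // pads 0-3

// Derives a distinct identity for each local pad from one persistent base.
inline Xuid XuidForPad(Xuid persistentBase, int pad) {
    if (pad < 0 || pad >= kMaxLocalPads) return kInvalidXuid;
    return persistentBase + static_cast<Xuid>(pad);
}
```

One consequence of this scheme is that the four pads of one machine occupy four consecutive values, so distinct machines only avoid overlap if their persistent bases are at least four apart.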
* Split-screen networking, window resize, bitmap font fix, and multiplayer stability

Adds the networking layer for non-host split-screen multiplayer, implements live window resize with swap chain recreation, fixes bitmap font scaling at small window sizes, and fixes several crash-causing bugs in the multiplayer stack (compression buffer overflow, TCP stream desync, chunk visibility race, CompressedTileStorage torn reads, reconnect stability).

== Non-host split-screen multiplayer ==

Each split-screen pad on a non-host client opens its own TCP connection to the host. From the host's perspective each connection looks like a normal remote player (gets its own smallId, Socket, PlayerConnection).

WinsockNetLayer: JoinSplitScreen(), CloseSplitScreenConnection(), SplitScreenRecvThreadProc, per-pad socket/thread/smallId tracking (s_splitScreenSocket[], s_splitScreenSmallId[], s_splitScreenRecvThread[]). GetLocalSocket() returns the correct TCP socket for a given local sender's smallId. GetSplitScreenSmallId() returns the host-assigned smallId for a pad.

GameNetworkManager::CreateSocket: non-host path (localPlayer && !IsHost() && IsInGameplay()) calls JoinSplitScreen, sets the IQNet slot's smallId and resolvedXuid, creates a non-hostLocal Socket + ClientConnection, sends PreLoginPacket, registers via addPendingLocalConnection.

PlatformNetworkManagerStub::RemoveLocalPlayerByUserIndex: implemented the formerly-empty stub. Calls NotifyPlayerLeaving, CloseSplitScreenConnection, and clears the IQNet slot fields so the pad can rejoin cleanly.

SmallId pool: s_nextSmallId starts at XUSER_MAX_COUNT (4), reserving m_player[0-3] for local pads so remote players never collide.

IQNetPlayer::SendData: non-host local senders now route through GetLocalSocket(m_smallId) instead of always using SendToSmallId.

IQNet::GetLocalPlayerByUserIndex: rewritten. Pad 0 on non-host uses GetLocalSmallId() for direct lookup; pads 1-3 check m_player[padIdx].

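The smallId reservation described above (ids 0-3 kept for local pads, remote ids starting at 4) can be illustrated with a small allocator. SmallIdPool, its methods, and the upper limit of 256 are hypothetical; the text only states that s_nextSmallId starts at XUSER_MAX_COUNT (4) and that freed ids are returned to a pool.

```cpp
#include <vector>

constexpr int kMaxLocalPads = 4;  // XUSER_MAX_COUNT in the text

class SmallIdPool {
public:
    // Hands out a recycled id if one is available, otherwise the next fresh id.
    // Returns -1 when the pool is exhausted.
    int acquire() {
        if (!m_free.empty()) {
            const int id = m_free.back();
            m_free.pop_back();
            return id;
        }
        if (m_next >= kLimit) return -1;
        return m_next++;
    }

    // Returns an id to the pool. Ids in the reserved local range, or ids that
    // were never handed out, are ignored. Double release is not guarded here.
    void release(int id) {
        if (id >= kMaxLocalPads && id < m_next) m_free.push_back(id);
    }

private:
    static constexpr int kLimit = 256;  // assumption: illustrative upper bound
    int m_next = kMaxLocalPads;         // first id outside the local pad range
    std::vector<int> m_free;
};
```

In the reconnect flow described in this commit, the release step would only happen after the old connection's threads have stopped, which is what the deferred recycle queue is for.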
C_4JProfile::IsSignedIn: pad 0 always returns true (was checking controller connection, which is unreliable on Win64). GetGamertag/GetDisplayName: for pads 1-3 with active local players, returns the pad-specific gamertag from IQNet::m_player instead of always returning the primary username.

ClientConnection: isPrimaryConnection() (true on host or for the primary pad on non-host) guards relative-delta and world-modifying handlers to prevent double-processing of shared state:
- Guarded: handleMoveEntity, handleMoveEntitySmall, handleChunkTilesUpdate, handleBlockRegionUpdate, handleTileUpdate, handleTakeItemEntity, handleSignUpdate, handleTileEntityData, handleTileEvent, handleTileDestruction, handleComplexItemData, handleLevelEvent, handleSoundEvent, handleParticleEvent, handleAddGlobalEntity.
- handleSetEntityMotion: secondary connections only accept motion targeting their own local player (knockback).
- handleExplosion: world modification (finalizeExplosion) guarded, per-player knockback unguarded. Added null check on localplayers[].
- Entity spawn/remove/teleport/data handlers left unguarded (putEntity is idempotent, absolute value setters).

handleLogin: added else clause to set level when the dimension already exists (was leaving level NULL on reconnect). handleChunkVisibilityArea/handleChunkVisibility: added null check on level. handleContainerOpen: added null check on localplayers[m_userIndex].

== Reconnect stability ==

PendingConnection: duplicate XUID no longer rejects with eDisconnect_Banned. Instead it force-disconnects the stale old connection via stalePlayer->connection->disconnect(), queues the old smallId for recycling via queueSmallIdForRecycle(), then calls handleAcceptedLogin for the new connection.

MinecraftServer: swapped tick order so players->tick() (disconnect queue) runs before connection->tick() (new logins). The old player is removed from PlayerList before the new LoginPacket's XUID check runs.

PlayerList: PushFreeSmallId and ClearSocketForSmallId moved here from DoWork, called only after PlayerConnection::disconnect() completes and the read/write threads are dead. New queueSmallIdForRecycle() method lets PendingConnection push smallIds into m_smallIdsToClose, which PlayerList::tick() processes through closePlayerConnectionBySmallId() for deferred cleanup. Prevents a race where the old write thread could resolve getPlayer() to a recycled smallId's new connection and send stale packets on it.

SocketInputStreamLocal::close() and SocketOutputStreamLocal::close() now actually clear their queues (std::swap with empty queue instead of calling .empty() which is a read-only no-op).

ServerConnection::stop(): pending and players vectors are snapshot-copied before iterating (prevents iterator invalidation). Remote players receive a DisconnectPacket via disconnect(eDisconnect_Quitting) instead of raw close(). tick(): added else clause so flush() only runs on live connections.

WinsockNetLayer::Shutdown(): accept thread stopped first (prevents new recv threads from spawning), then all recv threads are collected and waited on, then connections are closed and split-screen sockets cleaned up. Clears disconnect and free-pool vectors before deleting critical sections.

WinsockNetLayer::JoinGame(): waits for old s_clientRecvThread to fully exit before creating a new TCP connection. Prevents the old recv thread from reading bytes off the new socket and desynchronizing the stream.

== Compression buffer overflow ==

CompressLZXRLE and CompressRLE wrote RLE intermediate output into a fixed 100KB buffer with no bounds checking. Full chunk columns are ~160KB and the RLE step can expand 0xFF bytes to 2 bytes each, easily overflowing into rleDecompressBuf and heap metadata. This caused delayed crashes in unrelated code (Packet::readPacket, LevelRenderer::updateDirtyChunks) after the first autosave, since that's when full chunks get compressed.

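To make the overflow arithmetic concrete, the following is a minimal sketch of sizing the RLE scratch space for the worst case. Only the 100KB static buffer, the ~160KB chunk column and the 2-bytes-per-0xFF expansion come from the description above; the names kStaticRleBufSize, worstCaseRleSize and RleScratch are hypothetical, and the real code additionally serializes access to its static buffer with a critical section, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constant mirroring the fixed 100KB intermediate buffer.
constexpr std::size_t kStaticRleBufSize = 100 * 1024;

// Shared static scratch buffer (assumption: real code guards it with a lock).
static std::uint8_t g_staticRleBuf[kStaticRleBufSize];

// Worst case for an encoder that may emit 2 output bytes per input byte.
constexpr std::size_t worstCaseRleSize(std::size_t srcSize) { return srcSize * 2; }

// Picks the static buffer when the worst case fits, the heap otherwise.
class RleScratch {
public:
    explicit RleScratch(std::size_t srcSize) {
        const std::size_t need = worstCaseRleSize(srcSize);
        if (need > kStaticRleBufSize) {
            heap_.resize(need);            // large input: allocate dynamically
            data_ = heap_.data();
            capacity_ = need;
        } else {
            data_ = g_staticRleBuf;        // small input: no allocation at all
            capacity_ = kStaticRleBufSize;
        }
    }
    std::uint8_t *data() { return data_; }
    std::size_t capacity() const { return capacity_; }
    bool usesHeap() const { return !heap_.empty(); }

private:
    std::vector<std::uint8_t> heap_;
    std::uint8_t *data_ = nullptr;
    std::size_t capacity_ = 0;
};
```

With these numbers a 160KB column needs up to 320KB of scratch space, more than three times the old fixed buffer, while a 1KB input still takes the allocation-free path.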
Fix: dynamic allocation when worst-case RLE output (SrcSize * 2) exceeds the static buffer. Static buffer still used for small inputs (zero overhead). CompressRLE: moved LeaveCriticalSection after dynamic buffer cleanup.

DecompressLZXRLE: now checks zlib return value (was completely ignored). On failure, bails out immediately with *pDestSize = 0. Added RLE input bounds checking (pucIn >= pucEnd before reading count/data bytes) and output bounds checking (pucOut + count > pucOutEnd). Same bounds checks applied to DecompressRLE.

== Stream desync (Connection write thread) ==

The write thread had two output paths to the same TCP socket: bufferedDos (5KB buffered stream) and direct sos->writeWithFlags(). Chunk data sent via queueSend() used the direct path with shouldDelay=true, while other packets used bufferedDos. If bufferedDos had unflushed bytes, the direct write arrived at the client first, reordering the TCP stream and producing bad packet ID crashes.

Fix: flush bufferedDos immediately before every direct sos->writeWithFlags().

== Chunk visibility race (empty first chunk after 30s) ==

BlockRegionUpdatePacket (direct socket write via queueSend) could arrive at the client before ChunkVisibilityAreaPacket (buffered). The client called getChunk() on a chunk that didn't exist yet in the cache, got EmptyLevelChunk (whose setBlocksAndData is a no-op), and silently lost the block data. On superflat this left one invisible chunk; on normal worlds it crashed the renderer.

Fix: handleBlockRegionUpdate calls dimensionLevel->setChunkVisible() for full-chunk BRUPs before writing data, making it independent of packet ordering. Added post-write verification logging.

CompressedTileStorage race: get() reads indicesAndData twice without a lock. compress() can swap the pointer between reads, producing indices from the old buffer paired with data from the new buffer. Fix: snapshot indicesAndData into a local variable before deriving both pointers.
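The snapshot fix for the CompressedTileStorage race can be sketched as follows. This is an illustration under stated assumptions, not the project's code: TileStorageSketch, kIndexBytes and the "indices first, data after" layout are hypothetical, and std::atomic stands in for however the real member is published. The sketch only prevents mixing pointers derived from two different buffers; keeping the old buffer alive while a reader still uses it is a separate concern that it does not address.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: kIndexBytes of indices followed by the data block.
constexpr std::size_t kIndexBytes = 4;

struct TileStorageSketch {
    // Pointer that a compress()-style writer may swap at any time.
    std::atomic<std::uint8_t *> indicesAndData{nullptr};

    // Reads one tile from a single snapshot of the shared pointer, so the
    // index and data pointers are always derived from the same buffer.
    int get(std::size_t i) const {
        std::uint8_t *snapshot = indicesAndData.load(std::memory_order_acquire);
        if (!snapshot)
            return 0;                          // guard against a NULL snapshot
        const std::uint8_t *indices = snapshot;
        const std::uint8_t *data = snapshot + kIndexBytes;
        return data[indices[i]];
    }
};
```

The buggy shape would read the member once for the indices and again for the data; reading it once and deriving both pointers from the local copy removes the window in which a swap can land between the two reads.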
Same snapshot pattern applied to getData() (non-Vita path), isRenderChunkEmpty(), getHighestNonEmptyY(), getAllocatedSize(), and write(). All methods now also guard against NULL snapshots.

== Window resize ==

ResizeD3D() destroys the old swap chain, creates a new one at the target size, then patches InternalRenderManager members directly via memory offsets (0x20=swap chain, 0x28=RTV, 0x50=SRV, 0x98=DSV, 0x5138/0x513C=backbuffer width/height). Offset verification cross-checks known pointers (device at 0x10, swap chain at 0x20) before patching. Old RTV/SRV are intentionally leaked (orphaned with the old swap chain) to avoid fighting unknown ref holders in the precompiled RenderManager.

The flow: Suspend RenderManager, ClearState+Flush, release views, gdraw_D3D11_PreReset, destroy old swap chain, create new swap chain via IDXGIFactory, patch offsets, recreate RTV/SRV/DSV, rebind render targets, update UIController (updateRenderTargets + updateScreenSize), gdraw_D3D11_PostReset + SetRendertargetSize, IggyFlushInstalledFonts, Resume, PostProcesser::Init.

WM_SIZE handling defers resize during window drag (WM_ENTERSIZEMOVE/WM_EXITSIZEMOVE). Immediate resizes (maximize, programmatic) call ResizeD3D directly. Removed the old UpdateAspectRatio() function.

CleanupDevice() was leaking g_pDepthStencilView and g_pDepthStencilBuffer.

InitDevice: swap chain BufferUsage now includes DXGI_USAGE_SHADER_INPUT (needed for the SRV created from the backbuffer for CaptureThumbnail).

New globals: g_rScreenWidth/g_rScreenHeight (real window dimensions, updated on resize) vs g_iScreenWidth/g_iScreenHeight (fixed logical resolution, stays 1920x1080).

ComputeViewportForPlayer and getFovAndAspect now use g_rScreenWidth/g_rScreenHeight instead of the fixed startup values, so 3D perspective and split-screen viewports adapt to window size.

Main loop: rendering skipped when window is minimized (IsIconic check) to avoid 100% GPU usage on a hidden swap chain.

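The deferred WM_SIZE handling described in this section can be modelled as a small state machine. The sketch below is platform-independent and hypothetical: ResizeDeferrer and its method names are invented for illustration, and the caller is assumed to forward WM_ENTERSIZEMOVE, WM_SIZE and WM_EXITSIZEMOVE to it and to call its own resize routine whenever a method returns true.

```cpp
#include <cassert>

// Hypothetical helper deciding when a size change should trigger a resize.
struct ResizeDeferrer {
    bool inSizeMove = false;   // true between enter/exit of a window drag
    bool pending = false;      // a size change arrived during the drag
    int pendingW = 0;
    int pendingH = 0;

    void onEnterSizeMove() { inSizeMove = true; }

    // Records the latest size. Returns true if the caller should resize now
    // (maximize or programmatic resize), false if it is deferred.
    bool onSize(int w, int h) {
        pendingW = w;
        pendingH = h;
        if (inSizeMove) {
            pending = true;
            return false;
        }
        return true;
    }

    // Returns true if a deferred resize should be applied, using the last
    // size recorded during the drag.
    bool onExitSizeMove() {
        inSizeMove = false;
        const bool apply = pending;
        pending = false;
        return apply;
    }
};
```

Deferring this way means the expensive swap chain recreation runs once per drag, with the final size, instead of once per intermediate size message.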
Windows64_UIController: new updateRenderTargets(rtv, dsv) method updates cached D3D pointers used by gdraw_D3D11_SetTileOrigin every frame.

UIController.h: new inline updateScreenSize(w, h) sets m_fScreenWidth/m_fScreenHeight so all downstream UI code picks up the new size.

== Bitmap font scaling ==

At small window sizes, dynamic text (scrollable list items, HowToPlay pages) showed overlapping characters. Static SWF text was unaffected because it uses embedded vector glyphs.

Root cause in UIBitmapFont.cpp GetGlyphBitmap: when display scale is smaller than the bitmap's native scale (pixel_scale < truePixelScale, glyphScale stays at 1), Iggy displayed the glyph at native 1:1 pixel size but advanced the cursor by the smaller display-scale amount.

At intermediate window sizes (e.g. 1678x756, scale factor ~0.7), a second bug appeared: some SWF font sizes produced pixel_scale just above truePixelScale (13 for Mojangles_11) while others fell just below, splitting glyphs across the small-display and normal cache branches. The normal branch cached all glyphs in a single [truePixelScale, 99] range, so the first glyph cached set pixel_scale_correct for every subsequent request regardless of font size. Different font sizes then got scaled by wrong ratios (e.g. 18.9/13.3 = 1.42x with point sampling), producing visibly inconsistent letter sizes. This only happened at specific window sizes where the display scale put some fonts above and others below the truePixelScale boundary. Full 1080p and very small windows were unaffected because all fonts landed in the same branch.

Fix: on _WINDOWS64, always use pixel_scale_correct = truePixelScale so every cache entry is consistent regardless of which font size creates it first. Two cache ranges: downscale (pixel_scale < truePixelScale) uses bilinear for smooth reduction, upscale uses point_sample for crisp pixel-art rendering. At most two cache entries per glyph.
The console code path (fixed resolution, integer-multiple scaling) is preserved behind #else.

UIScene.cpp loadMovie: always load 1080.swf on _WINDOWS64 regardless of window size. The old height-based selection could pick 480 or 720 variants which either crashed or loaded the wrong skin library (skinHD.swf vs skin.swf). Display size is now set via Fit16x9 BEFORE the init tick so Iggy's ActionScript text field creation sees the same scale that render() will use. IggyFlushInstalledFonts() called after init tick to clear stale glyph cache entries from previous scenes.

Font.cpp addCharacterQuad/renderCharacter: yOff was computed with m_charWidth instead of m_charHeight, producing wrong texture coordinates for non-square glyph cells. This is the world-rendering font (chat, signs, name tags), not the Iggy UI font.

== XUID generation ==

Split-screen pad XUIDs derived by hashing baseXuid + iPad through Mix64 (DeriveXuidForPad in Windows64_Xuid.h) instead of simple addition. Pad 0 returns the base XUID unchanged for save compatibility. Includes validity fallbacks if the hash produces an invalid XUID. (Suggested by rtm516)

== Misc ==

Packet::readPacket: thread-local ring buffer tracks last 8 good packet IDs. On bad packet ID, dumps the history plus next 32 bytes of stream for diagnosing TCP desynchronization.

PendingConnection/PlayerList: debug logging for the reconnect flow (duplicate XUID handling, force-disconnect, handleAcceptedLogin, placeNewPlayer with smallId/entityId/dimension).

ClientConnection::handleBlockRegionUpdate: warning log when a full chunk arrives with ys==0 (empty full chunk, data loss indicator).

== Known issues / future work ==

SendOnSocket global lock (WinsockNetLayer.cpp): s_sendLock is a single CriticalSection serializing ALL TCP sends across ALL connections. If one client's send() blocks (TCP window full, slow network), every other write thread stalls: no data flows to any player until the slow send completes.
Each PlayerConnection has its own write thread, so with 8+ players one slow client can cause latency spikes or timeout disconnects for healthy players.

Fix: replace s_sendLock with per-socket locks indexed by smallId. The lock only needs to prevent header+payload interleaving on the SAME socket; sends to different sockets are independent. Deferred to a separate PR to keep this one focused.

Textures::releaseTexture: early return for id <= 0, checks TextureGetTexture(id) != NULL before calling glDeleteTextures. Prevents crashes on stale texture IDs after RenderManager reset.

UIController TextureSubstitutionDestroyCallback: null guard on Minecraft::GetInstance() and mc->textures before calling releaseTexture. Prevents crash during shutdown.

StringTable: removed __debugbreak() on language load failure in debug builds.

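The per-socket locking proposed under known issues could look roughly like the sketch below. Everything here is an assumption made for illustration: PerSocketSender and kMaxSmallIds are hypothetical names, std::mutex stands in for a CriticalSection, a one-byte length prefix stands in for the real packet header, and an in-memory byte vector stands in for the TCP socket.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical upper bound on smallId values (a uint8_t can index all slots).
constexpr std::size_t kMaxSmallIds = 256;

class PerSocketSender {
public:
    // Writes header then payload for one destination while holding only that
    // destination's lock, so a slow peer cannot stall sends to other peers.
    void send(std::uint8_t smallId, const std::vector<std::uint8_t> &payload) {
        std::lock_guard<std::mutex> guard(locks_[smallId]);
        std::vector<std::uint8_t> &wire = wires_[smallId];
        // Illustrative 1-byte length header; header and payload stay adjacent
        // because both are written under the same per-socket lock.
        wire.push_back(static_cast<std::uint8_t>(payload.size()));
        wire.insert(wire.end(), payload.begin(), payload.end());
    }

    const std::vector<std::uint8_t> &wire(std::uint8_t smallId) const {
        return wires_[smallId];
    }

private:
    std::array<std::mutex, kMaxSmallIds> locks_;
    std::array<std::vector<std::uint8_t>, kMaxSmallIds> wires_;
};
```

The design point is that the lock's only job is to keep one connection's header and payload contiguous, so its scope can shrink from "all sockets" to "this socket" without changing what is protected.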
2506 lines
84 KiB
C++
// gdraw_d3d1x_shared.inl - author: Fabian Giesen - copyright 2012 RAD Game Tools
//
// This file implements the part of the Iggy graphics driver layer shared between
// D3D10 and 11 (which is most of it). It heavily depends on a bunch of typedefs,
// #defines and utility functions that need to be set up correctly for the D3D version
// being targeted. This is a bit ugly, but much easier to maintain than the original
// solution, where we just kept two almost identical versions of this code.

// The native handle type holds resource handles and a coarse description.
typedef union {
   // handle that is a texture
   struct {
      ID3D1X(Texture2D) *d3d;
      ID3D1X(ShaderResourceView) *d3d_view;
      ID3D1X(RenderTargetView) *d3d_rtview;
      U32 w, h;
   } tex;

   // handle that is a vertex buffer
   struct {
      ID3D1X(Buffer) *verts;
      ID3D1X(Buffer) *inds;
   } vbuf;
} GDrawNativeHandle;

#define GDRAW_NO_STREAMING_MIPGEN   // This renderer doesn't use GDraw-internal mipmap generation
#include "gdraw_shared.inl"

// max rendertarget stack depth. this depends on the extent to which you
// use filters and non-standard blend modes, and how nested they are.
#define MAX_RENDER_STACK_DEPTH       8         // Iggy is hardcoded to a limit of 16... probably 1-3 is realistic
#define AATEX_SAMPLER                7         // sampler that aa_tex gets set in
#define STENCIL_STATE_CACHE_SIZE     32        // number of distinct stencil states we cache DepthStencilStates for
#define QUAD_IB_COUNT                2048      // quad index buffer has indices for this many quads

#define ASSERT_COUNT(a,b)            ((a) == (b) ? (b) : -1)

static GDrawFunctions gdraw_funcs;

// render target state
typedef struct
{
   GDrawHandle *color_buffer;
   S32 base_x, base_y, width, height;
   U32 flags;
   rrbool cached;
} GDrawFramebufferState;

struct ProgramWithCachedVariableLocations
{
   DWORD *bytecode;
   union {
      DWORD size;
      ID3D1X(PixelShader) *pshader;
      ID3D1X(VertexShader) *vshader;
   };
};

struct DynBuffer
{
   ID3D1X(Buffer) *buffer;
   U32 size;        // size of buffer
   U32 write_pos;   // start of most recently allocated chunk
   U32 alloc_pos;   // end of most recently allocated chunk (=start of next allocation)
};

///////////////////////////////////////////////////////////////////////////////
//
// GDraw data structure
//
//
// This is the primary rendering abstraction, which hides all
// the platform-specific rendering behavior from Iggy. It is
// full of platform-specific graphics state, and also general
// graphics state so that it doesn't have to callback into Iggy
// to get at that graphics state.

typedef struct
{
   ID3D1XDevice *d3d_device;
   ID3D1XContext *d3d_context;

   // fragment shaders
   ProgramWithCachedVariableLocations fprog[GDRAW_TEXTURE__count][3];
   ProgramWithCachedVariableLocations exceptional_blend[GDRAW_BLENDSPECIAL__count];
   ProgramWithCachedVariableLocations filter_prog[2][16];
   ProgramWithCachedVariableLocations blur_prog[MAX_TAPS+1];
   ProgramWithCachedVariableLocations colormatrix;
   ProgramWithCachedVariableLocations clear_ps;

   // vertex input layouts
   ID3D1X(InputLayout) *inlayout[GDRAW_vformat__count];

   // vertex shaders
   ProgramWithCachedVariableLocations vert[GDRAW_vformat__count]; // [format]

   // render targets
   GDrawHandleCache rendertargets;
   GDrawHandle rendertarget_handles[MAX_RENDER_STACK_DEPTH]; // not -1, because we use +1 to initialize

   gswf_recti rt_valid[MAX_RENDER_STACK_DEPTH+1]; // valid rect for texture clamping

   // size of framebuffer-sized texture used for implementing blend modes
   S32 frametex_width, frametex_height;

   // viewport setting (in pixels) for current frame
   S32 vx,vy;
   S32 fw,fh; // full width/height of virtual display
   S32 tw,th; // actual width/height of current tile
   S32 tpw,tph; // width/height of padded version of tile

   S32 tx0,ty0;
   S32 tx0p,ty0p;
   rrbool in_blur;
   struct {
      S32 x,y,w,h;
   } cview; // current viewport

   F32 projection[4]; // scalex,scaley,transx,transy
   F32 projmat[3][4];
   F32 xform_3d[3][4];
   rrbool use_3d;

   ID3D1X(RenderTargetView) *main_framebuffer;
   ID3D1X(DepthStencilView) *depth_buffer[2]; // 0=main, 1=rendertarget
   ID3D1X(ShaderResourceView) *main_resolve_target;
   rrbool main_msaa; // does main framebuffer have MSAA enabled?

   ID3D1X(Texture2D) *rt_depth_buffer;
   ID3D1X(Texture2D) *aa_tex;
   ID3D1X(ShaderResourceView) *aa_tex_view;
   ID3D1X(Buffer) *quad_ib; // canned quad indices

   // scale factor converting worldspace to viewspace <0,0>..<w,h>
   F32 world_to_pixel[2];

   // state objects
   ID3D1X(RasterizerState) *raster_state[2]; // [msaa]
   ID3D1X(SamplerState) *sampler_state[2][GDRAW_WRAP__count]; // [nearest][wrap]
   ID3D1X(BlendState) *blend_state[GDRAW_BLEND__count];
   ID3D1X(BlendState) *blend_no_color_write;
   ID3D1X(DepthStencilState) *depth_state[2][2]; // [set_id][test_id]

   // stencil state cache
   // SOA so the keys are tightly packed in a few cache lines!
   U32 stencil_cache_key[STENCIL_STATE_CACHE_SIZE];
   ID3D1X(DepthStencilState) *stencil_cache[STENCIL_STATE_CACHE_SIZE];
   U32 stencil_cache_lru[STENCIL_STATE_CACHE_SIZE];
   U32 stencil_cache_now;

   // constant buffers
   ID3D1X(Buffer) *cb_vertex;
   ID3D1X(Buffer) *cb_ps_common;
   ID3D1X(Buffer) *cb_filter;
   ID3D1X(Buffer) *cb_colormatrix;
   ID3D1X(Buffer) *cb_blur;

   // streaming buffers for dynamic vertex/index data
   DynBuffer dyn_vb;
   DynBuffer dyn_ib;

   U32 dyn_maxalloc, last_dyn_maxalloc;
   S32 max_quad_vert_count;

   // cached state
   U32 scissor_state; // ~0 if unknown, otherwise 0 or 1
   S32 blend_mode; // -1 if unknown, otherwise GDRAW_BLEND_*

   // render-state stack described above for 'temporary' rendering
   GDrawFramebufferState frame[MAX_RENDER_STACK_DEPTH];
   GDrawFramebufferState *cur;

   // texture and vertex buffer pools
   GDrawHandleCache *texturecache;
   GDrawHandleCache *vbufcache;

   // stat tracking
   rrbool frame_done;
   U64 frame_counter;

   // error handler
   void (__cdecl *error_handler)(HRESULT hr);
} GDraw;

static GDraw *gdraw;

static const F32 four_zeros[4] = { 0 }; // used in several places

////////////////////////////////////////////////////////////////////////
//
// General resource management for both textures and vertex buffers
//

template<typename T>
static void safe_release(T *&p)
{
   if (p) {
      p->Release();
      p = NULL;
   }
}

static void report_d3d_error(HRESULT hr, const char *call, const char *context)
{
   if (hr == E_OUTOFMEMORY)
      IggyGDrawSendWarning(NULL, "GDraw D3D out of memory in %s%s", call, context);
   else
      IggyGDrawSendWarning(NULL, "GDraw D3D error in %s%s: 0x%08x", call, context, hr);
}

static void unbind_resources(void)
{
   ID3D1XContext *d3d = gdraw->d3d_context;

   // unset active textures and vertex/index buffers,
   // to make sure there are no dangling refs
   static ID3D1X(ShaderResourceView) *no_views[3] = { 0 };
   ID3D1X(Buffer) *no_vb = NULL;
   UINT no_offs = 0;

   d3d->PSSetShaderResources(0, 3, no_views);
   d3d->IASetVertexBuffers(0, 1, &no_vb, &no_offs, &no_offs);
   d3d->IASetIndexBuffer(NULL, DXGI_FORMAT_UNKNOWN, 0);
}

static void api_free_resource(GDrawHandle *r)
{
   unbind_resources();
   if (r->state != GDRAW_HANDLE_STATE_user_owned) {
      if (!r->cache->is_vertex) {
         safe_release(r->handle.tex.d3d_view);
         safe_release(r->handle.tex.d3d_rtview);
         safe_release(r->handle.tex.d3d);
      } else {
         safe_release(r->handle.vbuf.verts);
         safe_release(r->handle.vbuf.inds);
      }
   }
}

static void RADLINK gdraw_UnlockHandles(GDrawStats * /*stats*/)
{
   gdraw_HandleCacheUnlockAll(gdraw->texturecache);
   gdraw_HandleCacheUnlockAll(gdraw->vbufcache);
}

////////////////////////////////////////////////////////////////////////
//
// Dynamic buffer
//

static void *start_write_dyn(DynBuffer *buf, U32 size)
{
   U8 *ptr = NULL;

   if (size > buf->size) {
      IggyGDrawSendWarning(NULL, "GDraw dynamic vertex buffer usage of %d bytes in one call larger than buffer size %d", size, buf->size);
      return NULL;
   }

   // update statistics
   gdraw->dyn_maxalloc = RR_MAX(gdraw->dyn_maxalloc, size);

   // invariant: current alloc_pos is in [0,size]
   assert(buf->alloc_pos <= buf->size);

   // wrap around when less than "size" bytes left in buffer
   buf->write_pos = ((buf->size - buf->alloc_pos) < size) ? 0 : buf->alloc_pos;

   // discard buffer whenever the current write position is 0;
   // done this way so that if a DISCARD Map() were to fail, we would
   // just keep retrying the next time around.
   ptr = (U8 *) map_buffer(gdraw->d3d_context, buf->buffer, buf->write_pos == 0);
   if (ptr) {
      ptr += buf->write_pos; // we return pointer to write position in buffer
      buf->alloc_pos = buf->write_pos + size; // bump alloc position
      assert(buf->alloc_pos <= buf->size); // invariant again
   }
   // if map_buffer fails, it will have sent a warning

   return ptr;
}

static U32 end_write_dyn(DynBuffer *buf)
{
   unmap_buffer(gdraw->d3d_context, buf->buffer);
   return buf->write_pos;
}

////////////////////////////////////////////////////////////////////////
//
// Stencil state cache
//

static void stencil_state_cache_clear()
{
   S32 i;

   for (i=0; i < STENCIL_STATE_CACHE_SIZE; ++i) {
      gdraw->stencil_cache_key[i] = 0;
      safe_release(gdraw->stencil_cache[i]);
      gdraw->stencil_cache_lru[i] = 0;
   }

   gdraw->stencil_cache_now = 0;
}

static ID3D1X(DepthStencilState) *stencil_state_cache_lookup(rrbool set_id, rrbool test_id, U8 read_mask, U8 write_mask)
{
   D3D1X_(DEPTH_STENCIL_DESC) desc;
   S32 i, best = 0;
   U32 key = (set_id << 1) | test_id | (read_mask << 8) | (write_mask << 16);
   U32 now, age, highest_age;
   HRESULT hr;

   // for LRU
   now = ++gdraw->stencil_cache_now;

   // do we have this in the cache?
   for (i=0; i < STENCIL_STATE_CACHE_SIZE; ++i) {
      if (gdraw->stencil_cache_key[i] == key) {
         gdraw->stencil_cache_lru[i] = now;
         return gdraw->stencil_cache[i];
      }
   }

   // not in the cache, find the best slot to replace it with (LRU)
   highest_age = 0;
   for (i=0; i < STENCIL_STATE_CACHE_SIZE; ++i) {
      if (!gdraw->stencil_cache[i]) { // unused slot!
         best = i;
         break;
      }

      age = now - gdraw->stencil_cache_lru[i];
      if (age > highest_age) {
         highest_age = age;
         best = i;
      }
   }

   // release old depth/stencil state at that position and create new one
   safe_release(gdraw->stencil_cache[best]);

   gdraw->depth_state[set_id][test_id]->GetDesc(&desc); // reference state
   desc.StencilEnable = TRUE;
   desc.StencilReadMask = read_mask;
   desc.StencilWriteMask = write_mask;
   desc.FrontFace.StencilFailOp = D3D1X_(STENCIL_OP_KEEP);
   desc.FrontFace.StencilDepthFailOp = D3D1X_(STENCIL_OP_KEEP);
   desc.FrontFace.StencilPassOp = D3D1X_(STENCIL_OP_REPLACE);
   desc.FrontFace.StencilFunc = D3D1X_(COMPARISON_EQUAL);
   desc.BackFace.StencilFailOp = D3D1X_(STENCIL_OP_KEEP);
   desc.BackFace.StencilDepthFailOp = D3D1X_(STENCIL_OP_KEEP);
   desc.BackFace.StencilPassOp = D3D1X_(STENCIL_OP_REPLACE);
   desc.BackFace.StencilFunc = D3D1X_(COMPARISON_EQUAL);

   hr = gdraw->d3d_device->CreateDepthStencilState(&desc, &gdraw->stencil_cache[best]);
   if (FAILED(hr))
      report_d3d_error(hr, "CreateDepthStencilState", "");

   gdraw->stencil_cache_key[best] = key;
   gdraw->stencil_cache_lru[best] = now;
   return gdraw->stencil_cache[best];
}

////////////////////////////////////////////////////////////////////////
//
// Texture creation/updating/deletion
//

extern GDrawTexture *gdraw_D3D1X_(WrappedTextureCreate)(ID3D1X(ShaderResourceView) *tex_view)
{
   GDrawStats stats={0};
   GDrawHandle *p = gdraw_res_alloc_begin(gdraw->texturecache, 0, &stats); // it may need to free one item to give us a handle
   p->handle.tex.d3d = NULL;
   p->handle.tex.d3d_view = tex_view;
   p->handle.tex.d3d_rtview = NULL;
   p->handle.tex.w = 1;
   p->handle.tex.h = 1;
   gdraw_HandleCacheAllocateEnd(p, 0, NULL, GDRAW_HANDLE_STATE_user_owned);
   return (GDrawTexture *) p;
}

extern void gdraw_D3D1X_(WrappedTextureChange)(GDrawTexture *tex, ID3D1X(ShaderResourceView) *tex_view)
{
   GDrawHandle *p = (GDrawHandle *) tex;
   p->handle.tex.d3d = NULL;
   p->handle.tex.d3d_view = tex_view;
}

extern void gdraw_D3D1X_(WrappedTextureDestroy)(GDrawTexture *tex)
{
   GDrawStats stats={0};
   gdraw_res_free((GDrawHandle *) tex, &stats);
}

static void RADLINK gdraw_SetTextureUniqueID(GDrawTexture *tex, void *old_id, void *new_id)
{
   GDrawHandle *p = (GDrawHandle *) tex;
   // if this is still the handle it's thought to be, change the owner;
   // if the owner *doesn't* match, then they're changing a stale handle, so ignore
   if (p->owner == old_id)
      p->owner = new_id;
}


static rrbool RADLINK gdraw_MakeTextureBegin(void *owner, S32 width, S32 height, gdraw_texture_format format, U32 flags, GDraw_MakeTexture_ProcessingInfo *p, GDrawStats *stats)
{
   GDrawHandle *t = NULL;
   DXGI_FORMAT dxgi_fmt;
   S32 bpp, size = 0, nmips = 0;

   if (width >= 16384 || height >= 16384) {
      IggyGDrawSendWarning(NULL, "GDraw texture size too large (%d x %d), dimension limit is 16384", width, height);
      return false;
   }

   if (format == GDRAW_TEXTURE_FORMAT_rgba32) {
      dxgi_fmt = DXGI_FORMAT_R8G8B8A8_UNORM;
      bpp = 4;
   } else {
      dxgi_fmt = DXGI_FORMAT_R8_UNORM;
      bpp = 1;
   }

   // compute estimated size of texture in video memory
   do {
      size += RR_MAX(width >> nmips, 1) * RR_MAX(height >> nmips, 1) * bpp;
      ++nmips;
   } while ((flags & GDRAW_MAKETEXTURE_FLAGS_mipmap) && ((width >> nmips) || (height >> nmips)));

   // try to allocate memory for the client to write to
   p->texture_data = (U8 *) IggyGDrawMalloc(size);
   if (!p->texture_data) {
      IggyGDrawSendWarning(NULL, "GDraw out of memory to store texture data to pass to D3D for %d x %d texture", width, height);
      return false;
   }

   // allocate a handle and make room in the cache for this much data
   t = gdraw_res_alloc_begin(gdraw->texturecache, size, stats);
   if (!t) {
      IggyGDrawFree(p->texture_data);
      return false;
   }

   t->handle.tex.w = width;
   t->handle.tex.h = height;
   t->handle.tex.d3d = NULL;
   t->handle.tex.d3d_view = NULL;
   t->handle.tex.d3d_rtview = NULL;

   p->texture_type = GDRAW_TEXTURE_TYPE_rgba;
   p->p0 = t;
   p->p1 = owner;
   p->i0 = width;
   p->i1 = height;
   p->i2 = flags;
   p->i3 = dxgi_fmt;
   p->i4 = size;
   p->i5 = nmips;
   p->i6 = bpp;

   p->stride_in_bytes = width * bpp;
   p->num_rows = height;

   return true;
}

static rrbool RADLINK gdraw_MakeTextureMore(GDraw_MakeTexture_ProcessingInfo * /*p*/)
{
   return false;
}

static GDrawTexture * RADLINK gdraw_MakeTextureEnd(GDraw_MakeTexture_ProcessingInfo *p, GDrawStats *stats)
{
   GDrawHandle *t = (GDrawHandle *) p->p0;
   D3D1X_(SUBRESOURCE_DATA) mipdata[24];
   S32 i, w, h, nmips, bpp;
   HRESULT hr = S_OK;
   const char *failed_call;
   U8 *ptr;

   // generate mip maps and set up descriptors for them
   assert(p->i5 <= 24);
   ptr = p->texture_data;
   w = p->i0;
   h = p->i1;
   nmips = p->i5;
   bpp = p->i6;

   for (i=0; i < nmips; ++i) {
      mipdata[i].pSysMem = ptr;
      mipdata[i].SysMemPitch = RR_MAX(w >> i, 1) * bpp;
      mipdata[i].SysMemSlicePitch = 0;
      ptr += mipdata[i].SysMemPitch * RR_MAX(h >> i, 1);

      // create mip data by downsampling
      if (i)
         gdraw_Downsample((U8 *) mipdata[i].pSysMem, mipdata[i].SysMemPitch, w >> i, h >> i,
            (U8 *) mipdata[i-1].pSysMem, mipdata[i-1].SysMemPitch, bpp);
   }

   // actually create texture
   D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(w), static_cast<U32>(h), static_cast<U32>(nmips), 1, static_cast<DXGI_FORMAT>(p->i3), { 1, 0 },
      (p->i2 & GDRAW_MAKETEXTURE_FLAGS_updatable) ? D3D1X_(USAGE_DEFAULT) : D3D1X_(USAGE_IMMUTABLE),
      D3D1X_(BIND_SHADER_RESOURCE), 0, 0 };

   failed_call = "CreateTexture2D";
   hr = gdraw->d3d_device->CreateTexture2D(&desc, mipdata, &t->handle.tex.d3d);
   if (FAILED(hr)) goto done;

   // and create a corresponding shader resource view
   failed_call = "CreateShaderResourceView";
   hr = gdraw->d3d_device->CreateShaderResourceView(t->handle.tex.d3d, NULL, &t->handle.tex.d3d_view);

done:
   if (!FAILED(hr)) {
      gdraw_HandleCacheAllocateEnd(t, p->i4, p->p1, (p->i2 & GDRAW_MAKETEXTURE_FLAGS_never_flush) ? GDRAW_HANDLE_STATE_pinned : GDRAW_HANDLE_STATE_locked);
      stats->nonzero_flags |= GDRAW_STATS_alloc_tex;
      stats->alloc_tex += 1;
      stats->alloc_tex_bytes += p->i4;
   } else {
      safe_release(t->handle.tex.d3d);
      safe_release(t->handle.tex.d3d_view);

      gdraw_HandleCacheAllocateFail(t);
      t = NULL;
      report_d3d_error(hr, failed_call, " while creating texture");
   }

   IggyGDrawFree(p->texture_data);
   return (GDrawTexture *) t;
}

static rrbool RADLINK gdraw_UpdateTextureBegin(GDrawTexture *t, void *unique_id, GDrawStats * /*stats*/)
{
   return gdraw_HandleCacheLock((GDrawHandle *) t, unique_id);
}

static void RADLINK gdraw_UpdateTextureRect(GDrawTexture *t, void * /*unique_id*/, S32 x, S32 y, S32 stride, S32 w, S32 h, U8 *samples, gdraw_texture_format /*format*/)
{
   GDrawHandle *s = (GDrawHandle *) t;
   D3D1X_(BOX) box = { static_cast<U32>(x), static_cast<U32>(y), 0U, static_cast<U32>(x + w), static_cast<U32>(y + h), 1U };

   gdraw->d3d_context->UpdateSubresource(s->handle.tex.d3d, 0, &box, samples, stride, 0);
}

static void RADLINK gdraw_UpdateTextureEnd(GDrawTexture *t, void * /*unique_id*/, GDrawStats * /*stats*/)
{
   gdraw_HandleCacheUnlock((GDrawHandle *) t);
}

static void RADLINK gdraw_FreeTexture(GDrawTexture *tt, void *unique_id, GDrawStats *stats)
{
   GDrawHandle *t = (GDrawHandle *) tt;
   assert(t != NULL); // @GDRAW_ASSERT
   if (t->owner == unique_id || unique_id == NULL) {
      if (t->cache == &gdraw->rendertargets) {
         gdraw_HandleCacheUnlock(t);
         // cache it by simply not freeing it
         return;
      }

      gdraw_res_free(t, stats);
   }
}

static rrbool RADLINK gdraw_TryToLockTexture(GDrawTexture *t, void *unique_id, GDrawStats * /*stats*/)
{
   return gdraw_HandleCacheLock((GDrawHandle *) t, unique_id);
}

static void RADLINK gdraw_DescribeTexture(GDrawTexture *tex, GDraw_Texture_Description *desc)
{
   GDrawHandle *p = (GDrawHandle *) tex;
   desc->width = p->handle.tex.w;
   desc->height = p->handle.tex.h;
   desc->size_in_bytes = p->bytes;
}

static void RADLINK gdraw_SetAntialiasTexture(S32 width, U8 *rgba)
{
   HRESULT hr;

   safe_release(gdraw->aa_tex_view);
   safe_release(gdraw->aa_tex);

   D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(width), 1U, 1U, 1U, DXGI_FORMAT_R8G8B8A8_UNORM, { 1, 0 }, D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_SHADER_RESOURCE), 0U, 0U };
   D3D1X_(SUBRESOURCE_DATA) data = { rgba, static_cast<U32>(width) * 4U, 0U };

   hr = gdraw->d3d_device->CreateTexture2D(&desc, &data, &gdraw->aa_tex);
   if (FAILED(hr)) {
      report_d3d_error(hr, "CreateTexture2D", "");
      return;
   }

   hr = gdraw->d3d_device->CreateShaderResourceView(gdraw->aa_tex, NULL, &gdraw->aa_tex_view);
   if (FAILED(hr)) {
      report_d3d_error(hr, "CreateShaderResourceView", " while creating texture");
      safe_release(gdraw->aa_tex);
      return;
   }
}

////////////////////////////////////////////////////////////////////////
//
// Vertex buffer creation/deletion
//

static rrbool RADLINK gdraw_MakeVertexBufferBegin(void *unique_id, gdraw_vformat /*vformat*/, S32 vbuf_size, S32 ibuf_size, GDraw_MakeVertexBuffer_ProcessingInfo *p, GDrawStats *stats)
{
   // prepare staging buffers for the app to put data into
   p->vertex_data = (U8 *) IggyGDrawMalloc(vbuf_size);
   p->index_data = (U8 *) IggyGDrawMalloc(ibuf_size);
   if (p->vertex_data && p->index_data) {
      GDrawHandle *vb = gdraw_res_alloc_begin(gdraw->vbufcache, vbuf_size + ibuf_size, stats);
      if (vb) {
         vb->handle.vbuf.verts = NULL;
         vb->handle.vbuf.inds = NULL;

         p->vertex_data_length = vbuf_size;
         p->index_data_length = ibuf_size;
         p->p0 = vb;
         p->p1 = unique_id;
         return true;
      }
   }

   if (p->vertex_data)
      IggyGDrawFree(p->vertex_data);
   if (p->index_data)
      IggyGDrawFree(p->index_data);

   return false;
}

static rrbool RADLINK gdraw_MakeVertexBufferMore(GDraw_MakeVertexBuffer_ProcessingInfo * /*p*/)
{
   assert(0);
   return false;
}

static GDrawVertexBuffer * RADLINK gdraw_MakeVertexBufferEnd(GDraw_MakeVertexBuffer_ProcessingInfo *p, GDrawStats * /*stats*/)
{
   GDrawHandle *vb = (GDrawHandle *) p->p0;

   HRESULT hr;
   D3D1X_(BUFFER_DESC) vbdesc = { static_cast<U32>(p->vertex_data_length), D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_VERTEX_BUFFER), 0U, 0U };
   D3D1X_(SUBRESOURCE_DATA) vbdata = { p->vertex_data, 0, 0 };

   D3D1X_(BUFFER_DESC) ibdesc = { static_cast<U32>(p->index_data_length), D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_INDEX_BUFFER), 0U, 0U };
   D3D1X_(SUBRESOURCE_DATA) ibdata = { p->index_data, 0, 0 };

   hr = gdraw->d3d_device->CreateBuffer(&vbdesc, &vbdata, &vb->handle.vbuf.verts);
   if (!FAILED(hr))
      hr = gdraw->d3d_device->CreateBuffer(&ibdesc, &ibdata, &vb->handle.vbuf.inds);

   if (FAILED(hr)) {
      safe_release(vb->handle.vbuf.verts);
      safe_release(vb->handle.vbuf.inds);

      gdraw_HandleCacheAllocateFail(vb);
      vb = NULL;

      report_d3d_error(hr, "CreateBuffer", " creating vertex buffer");
   } else {
      gdraw_HandleCacheAllocateEnd(vb, p->vertex_data_length + p->index_data_length, p->p1, GDRAW_HANDLE_STATE_locked);
   }

   IggyGDrawFree(p->vertex_data);
   IggyGDrawFree(p->index_data);

   return (GDrawVertexBuffer *) vb;
}

static rrbool RADLINK gdraw_TryLockVertexBuffer(GDrawVertexBuffer *vb, void *unique_id, GDrawStats * /*stats*/)
{
   return gdraw_HandleCacheLock((GDrawHandle *) vb, unique_id);
}

static void RADLINK gdraw_FreeVertexBuffer(GDrawVertexBuffer *vb, void *unique_id, GDrawStats *stats)
{
   GDrawHandle *h = (GDrawHandle *) vb;
   assert(h != NULL); // @GDRAW_ASSERT
   if (h->owner == unique_id)
      gdraw_res_free(h, stats);
}

static void RADLINK gdraw_DescribeVertexBuffer(GDrawVertexBuffer *vbuf, GDraw_VertexBuffer_Description *desc)
{
   GDrawHandle *p = (GDrawHandle *) vbuf;
   desc->size_in_bytes = p->bytes;
}

////////////////////////////////////////////////////////////////////////
//
// Create/free (or cache) render targets
//

static GDrawHandle *get_color_rendertarget(GDrawStats *stats)
{
   const char *failed_call;

   // try to recycle LRU rendertarget
   GDrawHandle *t = gdraw_HandleCacheGetLRU(&gdraw->rendertargets);
   if (t) {
      gdraw_HandleCacheLock(t, (void *) 1);
      return t;
   }

   // ran out of RTs, allocate a new one
   S32 size = gdraw->frametex_width * gdraw->frametex_height * 4;
   if (gdraw->rendertargets.bytes_free < size) {
      IggyGDrawSendWarning(NULL, "GDraw rendertarget allocation failed: hit size limit of %d bytes", gdraw->rendertargets.total_bytes);
      return NULL;
   }

   t = gdraw_HandleCacheAllocateBegin(&gdraw->rendertargets);
   if (!t) {
      IggyGDrawSendWarning(NULL, "GDraw rendertarget allocation failed: hit handle limit");
      return t;
   }

   D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(gdraw->frametex_width), static_cast<U32>(gdraw->frametex_height), 1U, 1U, DXGI_FORMAT_R8G8B8A8_UNORM, { 1, 0 },
                                   D3D1X_(USAGE_DEFAULT), D3D1X_(BIND_SHADER_RESOURCE) | D3D1X_(BIND_RENDER_TARGET), 0U, 0U };

   t->handle.tex.d3d = NULL;
   t->handle.tex.d3d_view = NULL;
   t->handle.tex.d3d_rtview = NULL;

   HRESULT hr = gdraw->d3d_device->CreateTexture2D(&desc, NULL, &t->handle.tex.d3d);
   failed_call = "CreateTexture2D";
   if (!FAILED(hr)) {
      hr = gdraw->d3d_device->CreateShaderResourceView(t->handle.tex.d3d, NULL, &t->handle.tex.d3d_view);
      // name the call that actually ran, so the error report points at the right API
      failed_call = "CreateShaderResourceView";
   }
   if (!FAILED(hr)) {
      hr = gdraw->d3d_device->CreateRenderTargetView(t->handle.tex.d3d, NULL, &t->handle.tex.d3d_rtview);
      failed_call = "CreateRenderTargetView";
   }

   if (FAILED(hr)) {
      safe_release(t->handle.tex.d3d);
      safe_release(t->handle.tex.d3d_view);
      safe_release(t->handle.tex.d3d_rtview);
      gdraw_HandleCacheAllocateFail(t);

      report_d3d_error(hr, failed_call, " creating rendertarget");

      return NULL;
   }

   gdraw_HandleCacheAllocateEnd(t, size, (void *) 1, GDRAW_HANDLE_STATE_locked);
   stats->nonzero_flags |= GDRAW_STATS_alloc_tex;
   stats->alloc_tex += 1;
   stats->alloc_tex_bytes += size;

   return t;
}

static ID3D1X(DepthStencilView) *get_rendertarget_depthbuffer(GDrawStats *stats)
{
   if (!gdraw->depth_buffer[1]) {
      const char *failed_call;
      assert(!gdraw->rt_depth_buffer);

      D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(gdraw->frametex_width), static_cast<U32>(gdraw->frametex_height), 1U, 1U, DXGI_FORMAT_D24_UNORM_S8_UINT, { 1, 0 },
                                      D3D1X_(USAGE_DEFAULT), D3D1X_(BIND_DEPTH_STENCIL), 0U, 0U };

      HRESULT hr = gdraw->d3d_device->CreateTexture2D(&desc, NULL, &gdraw->rt_depth_buffer);
      failed_call = "CreateTexture2D";
      if (!FAILED(hr)) {
         hr = gdraw->d3d_device->CreateDepthStencilView(gdraw->rt_depth_buffer, NULL, &gdraw->depth_buffer[1]);
         failed_call = "CreateDepthStencilView while creating rendertarget";
      }

      if (FAILED(hr)) {
         report_d3d_error(hr, failed_call, "");
         safe_release(gdraw->rt_depth_buffer);
         safe_release(gdraw->depth_buffer[1]);
      } else {
         stats->nonzero_flags |= GDRAW_STATS_alloc_tex;
         stats->alloc_tex += 1;
         stats->alloc_tex_bytes += gdraw->frametex_width * gdraw->frametex_height * 4;

         gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_DEPTH) | D3D1X_(CLEAR_STENCIL), 1.0f, 0);
      }
   }

   return gdraw->depth_buffer[1];
}

static void flush_rendertargets(GDrawStats *stats)
{
   gdraw_res_flush(&gdraw->rendertargets, stats);

   safe_release(gdraw->depth_buffer[1]);
   safe_release(gdraw->rt_depth_buffer);
}

////////////////////////////////////////////////////////////////////////
//
// Constant buffer layouts
//

struct VertexVars
{
   F32 world[2][4];
   F32 x_off[4];
   F32 texgen_s[4];
   F32 texgen_t[4];
   F32 x3d[4];
   F32 y3d[4];
   F32 w3d[4];
};

struct PixelCommonVars
{
   F32 color_mul[4];
   F32 color_add[4];
   F32 focal[4];
   F32 rescale1[4];
};

struct PixelParaFilter
{
   F32 clamp0[4], clamp1[4];
   F32 color[4], color2[4];
   F32 tc_off[4];
};

struct PixelParaBlur
{
   F32 clamp[4];
   F32 tap[9][4];
};

struct PixelParaColorMatrix
{
   F32 data[5][4];
};

////////////////////////////////////////////////////////////////////////
//
// Rendering helpers
//

static void disable_scissor(int force)
{
   if (force || gdraw->scissor_state) {
      // disable scissor by setting whole viewport as scissor rect
      S32 x = gdraw->cview.x;
      S32 y = gdraw->cview.y;
      D3D1X_(RECT) r = { x, y, x + gdraw->cview.w, y + gdraw->cview.h };

      gdraw->d3d_context->RSSetScissorRects(1, &r);
      gdraw->scissor_state = 0;
   }
}

static void set_viewport_raw(S32 x, S32 y, S32 w, S32 h)
{
   D3D1X_(VIEWPORT) vp = { (ViewCoord) x, (ViewCoord) y, (ViewCoord) w, (ViewCoord) h, 0.0f, 1.0f };
   gdraw->d3d_context->RSSetViewports(1, &vp);
   gdraw->cview.x = x;
   gdraw->cview.y = y;
   gdraw->cview.w = w;
   gdraw->cview.h = h;

   disable_scissor(1);
}

static void set_projection_base(void)
{
   // x3d = < viewproj.x, 0, 0, 0 >
   // y3d = < 0, viewproj.y, 0, 0 >
   // w3d = < viewproj.z, viewproj.w, 1.0, 1.0 >

   memset(gdraw->projmat[0], 0, sizeof(gdraw->projmat));
   gdraw->projmat[0][0] = gdraw->projection[0];
   gdraw->projmat[1][1] = gdraw->projection[1];
   gdraw->projmat[2][0] = gdraw->projection[2];
   gdraw->projmat[2][1] = gdraw->projection[3];

   gdraw->projmat[2][2] = 1.0;
   gdraw->projmat[2][3] = 1.0;
}

static void set_projection_raw(S32 x0, S32 x1, S32 y0, S32 y1)
{
   gdraw->projection[0] = 2.0f / (x1-x0);
   gdraw->projection[1] = 2.0f / (y1-y0);
   gdraw->projection[2] = (x1+x0)/(F32)(x0-x1);
   gdraw->projection[3] = (y1+y0)/(F32)(y0-y1);

   set_projection_base();
}

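/*
Illustrative sketch, not part of the GDraw source. The four terms computed in
set_projection_raw above look like a 2D orthographic mapping: a per-axis scale
of 2/(hi-lo) and an offset of (hi+lo)/(lo-hi), which together send the range
[lo, hi] to [-1, +1]. The names Ortho2D, make_ortho, clip_x and clip_y are
hypothetical, and applying the terms as scale * v + offset is an assumption
about how the vertex shader consumes them.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the four gdraw->projection[] terms.
struct Ortho2D { float sx, sy, ox, oy; };

// Same arithmetic as set_projection_raw: scale = 2/(hi-lo), offset = (hi+lo)/(lo-hi).
static Ortho2D make_ortho(int x0, int x1, int y0, int y1)
{
   Ortho2D p;
   p.sx = 2.0f / (x1 - x0);
   p.sy = 2.0f / (y1 - y0);
   p.ox = (x1 + x0) / (float) (x0 - x1);
   p.oy = (y1 + y0) / (float) (y0 - y1);
   return p;
}

// Assumed use of the terms: clip-space coordinate = scale * pixel + offset.
static float clip_x(const Ortho2D &p, float x) { return p.sx * x + p.ox; }
static float clip_y(const Ortho2D &p, float y) { return p.sy * y + p.oy; }
```

Passing y0 > y1, as set_projection does for the main framebuffer, flips the
vertical axis so that pixel row 0 lands at the top of clip space.
*/
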
static void set_viewport(void)
{
   if (gdraw->in_blur) {
      set_viewport_raw(0, 0, gdraw->tpw, gdraw->tph);
      return;
   }

   if (gdraw->cur == gdraw->frame) // if the rendering stack is empty
      // render a tile-sized region to the user-request tile location
      set_viewport_raw(gdraw->vx, gdraw->vy, gdraw->tw, gdraw->th);
   else if (gdraw->cur->cached)
      set_viewport_raw(0, 0, gdraw->cur->width, gdraw->cur->height);
   else
      // if on the render stack, draw a padded-tile-sized region at the origin
      set_viewport_raw(0, 0, gdraw->tpw, gdraw->tph);
}

static void set_projection(void)
{
   if (gdraw->in_blur) return;
   if (gdraw->cur == gdraw->frame) // if the render stack is empty
      set_projection_raw(gdraw->tx0, gdraw->tx0+gdraw->tw, gdraw->ty0+gdraw->th, gdraw->ty0);
   else if (gdraw->cur->cached)
      set_projection_raw(gdraw->cur->base_x, gdraw->cur->base_x+gdraw->cur->width, gdraw->cur->base_y, gdraw->cur->base_y+gdraw->cur->height);
   else
      set_projection_raw(gdraw->tx0p, gdraw->tx0p+gdraw->tpw, gdraw->ty0p+gdraw->tph, gdraw->ty0p);
}

static void clear_renderstate(void)
{
   gdraw->d3d_context->ClearState();
}

static void set_common_renderstate()
{
   ID3D1XContext *d3d = gdraw->d3d_context;
   S32 i;

   clear_renderstate();

   // all the render states we never change while drawing
   d3d->IASetPrimitiveTopology(D3D1X_(PRIMITIVE_TOPOLOGY_TRIANGLELIST));

   d3d->PSSetShaderResources(7, 1, &gdraw->aa_tex_view);
   d3d->PSSetSamplers(7, 1, &gdraw->sampler_state[0][GDRAW_WRAP_clamp]);

   // set a well-defined default sampler for all PS textures we use
   for (i=0; i < 3; ++i)
      d3d->PSSetSamplers(i, 1, &gdraw->sampler_state[0][GDRAW_WRAP_clamp]);

   // reset our state caching
   gdraw->scissor_state = ~0u;
   gdraw->blend_mode = -1;
}

static void manual_clear(gswf_recti *r, GDrawStats *stats);
static void set_render_target(GDrawStats *stats);

////////////////////////////////////////////////////////////////////////
//
// Begin/end rendering of a tile and per-frame processing
//

void gdraw_D3D1X_(SetRendertargetSize)(S32 w, S32 h)
{
   if (gdraw && (w != gdraw->frametex_width || h != gdraw->frametex_height)) {
      GDrawStats stats = { 0 };
      gdraw->frametex_width = w;
      gdraw->frametex_height = h;
      flush_rendertargets(&stats);
   }
}

void gdraw_D3D1X_(SetTileOrigin)(ID3D1X(RenderTargetView) *main_rt, ID3D1X(DepthStencilView) *main_ds, ID3D1X(ShaderResourceView) *non_msaa_rt, S32 x, S32 y)
{
   if (!gdraw) return; // AAR - safety check because Windows calls resize early

   D3D1X_(RENDER_TARGET_VIEW_DESC) desc;

   if (gdraw->frame_done) {
      ++gdraw->frame_counter;
      gdraw->frame_done = false;
   }

   main_rt->GetDesc(&desc);

   gdraw->main_framebuffer = main_rt;
   gdraw->main_resolve_target = non_msaa_rt;
   gdraw->main_msaa = (desc.ViewDimension == D3D1X_(RTV_DIMENSION_TEXTURE2DMS));
   gdraw->depth_buffer[0] = main_ds;

   gdraw->vx = x;
   gdraw->vy = y;
}

static void RADLINK gdraw_SetViewSizeAndWorldScale(S32 w, S32 h, F32 scalex, F32 scaley)
{
   static S32 s_lastW = 0, s_lastH = 0;
   static F32 s_lastSx = 0, s_lastSy = 0;
   if (w != s_lastW || h != s_lastH || scalex != s_lastSx || scaley != s_lastSy) {
      app.DebugPrintf("[GDRAW] SetViewSize: fw=%d fh=%d scale=%.6f,%.6f frametex=%dx%d vx=%d vy=%d\n",
                      w, h, scalex, scaley, gdraw->frametex_width, gdraw->frametex_height, gdraw->vx, gdraw->vy);
      s_lastW = w; s_lastH = h; s_lastSx = scalex; s_lastSy = scaley;
   }
   memset(gdraw->frame, 0, sizeof(gdraw->frame));
   gdraw->cur = gdraw->frame;
   gdraw->fw = w;
   gdraw->fh = h;
   gdraw->tw = w;
   gdraw->th = h;
   gdraw->world_to_pixel[0] = scalex;
   gdraw->world_to_pixel[1] = scaley;
   set_viewport();
}

// must include anything necessary for texture creation/update
static void RADLINK gdraw_RenderingBegin(void)
{
}
static void RADLINK gdraw_RenderingEnd(void)
{
}

static void RADLINK gdraw_RenderTileBegin(S32 x0, S32 y0, S32 x1, S32 y1, S32 pad, GDrawStats *stats)
{
   if (x0 == 0 && y0 == 0 && x1 == gdraw->fw && y1 == gdraw->fh)
      pad = 0;

   gdraw->tx0 = x0;
   gdraw->ty0 = y0;
   gdraw->tw = x1-x0;
   gdraw->th = y1-y0;

   // padded region
   gdraw->tx0p = RR_MAX(x0 - pad, 0);
   gdraw->ty0p = RR_MAX(y0 - pad, 0);
   gdraw->tpw = RR_MIN(x1 + pad, gdraw->fw) - gdraw->tx0p;
   gdraw->tph = RR_MIN(y1 + pad, gdraw->fh) - gdraw->ty0p;

   // make sure our rendertargets are large enough to contain the tile
   if (gdraw->tpw > gdraw->frametex_width || gdraw->tph > gdraw->frametex_height) {
      gdraw->frametex_width = RR_MAX(gdraw->tpw, gdraw->frametex_width);
      gdraw->frametex_height = RR_MAX(gdraw->tph, gdraw->frametex_height);

      flush_rendertargets(stats);
   }
   assert(gdraw->tpw <= gdraw->frametex_width && gdraw->tph <= gdraw->frametex_height);

   // set up rendertargets we'll use
   set_common_renderstate();
   gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[0], D3D1X_(CLEAR_DEPTH) | D3D1X_(CLEAR_STENCIL), 1.0f, 0);
   if (gdraw->depth_buffer[1])
      gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_DEPTH) | D3D1X_(CLEAR_STENCIL), 1.0f, 0);

   set_projection();
   set_viewport();
   set_render_target(stats);
}

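/*
Illustrative sketch, not part of the GDraw source. It isolates the padded-tile
arithmetic from gdraw_RenderTileBegin above: the tile grows by `pad` pixels on
each side, is clamped to the frame, and gets no padding at all when it already
covers the whole frame. PaddedTile and pad_tile are hypothetical names;
std::max and std::min stand in for RR_MAX and RR_MIN.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type: origin and size of the padded tile (tx0p, ty0p, tpw, tph).
struct PaddedTile { int x0, y0, w, h; };

static PaddedTile pad_tile(int x0, int y0, int x1, int y1, int pad, int fw, int fh)
{
   // a tile that covers the whole frame needs no padding
   if (x0 == 0 && y0 == 0 && x1 == fw && y1 == fh)
      pad = 0;

   PaddedTile t;
   t.x0 = std::max(x0 - pad, 0);
   t.y0 = std::max(y0 - pad, 0);
   t.w = std::min(x1 + pad, fw) - t.x0;
   t.h = std::min(y1 + pad, fh) - t.y0;
   return t;
}
```

A corner tile is only padded on its interior sides because the clamp removes
the padding that would fall outside the frame.
*/
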
static void RADLINK gdraw_RenderTileEnd(GDrawStats * /*stats*/)
{
}

void gdraw_D3D1X_(NoMoreGDrawThisFrame)(void)
{
   clear_renderstate();
   gdraw->frame_done = true;

   gdraw->last_dyn_maxalloc = gdraw->dyn_maxalloc;
   gdraw->dyn_maxalloc = 0;

   // reset dynamic buffer alloc position so they get DISCARDed
   // next time around.
   gdraw->dyn_vb.alloc_pos = 0;
   gdraw->dyn_ib.alloc_pos = 0;

   GDrawFence now = { gdraw->frame_counter };
   gdraw_HandleCacheTick(gdraw->texturecache, now);
   gdraw_HandleCacheTick(gdraw->vbufcache, now);
}

#define MAX_DEPTH_VALUE (1 << 13)

static void RADLINK gdraw_GetInfo(GDrawInfo *d)
{
   d->num_stencil_bits = 8;
   d->max_id = MAX_DEPTH_VALUE-2;
   // for floating point depth, just use mantissa, e.g. 16-20 bits
   d->buffer_format = GDRAW_BFORMAT_vbib;
   d->shared_depth_stencil = 1;
   d->always_mipmap = 1;
#ifndef GDRAW_D3D11_LEVEL9
   d->max_texture_size = 8192;
   d->conditional_nonpow2 = 0;
#else
   d->max_texture_size = 2048;
   d->conditional_nonpow2 = 1;
#endif
}

////////////////////////////////////////////////////////////////////////
//
// Enable/disable rendertargets in stack fashion
//

static ID3D1X(RenderTargetView) *get_active_render_target()
{
   if (gdraw->cur->color_buffer) {
      unbind_resources(); // to make sure this RT isn't accidentally set as a texture (avoid D3D warnings)
      return gdraw->cur->color_buffer->handle.tex.d3d_rtview;
   } else
      return gdraw->main_framebuffer;
}

static void set_render_target(GDrawStats *stats)
{
   ID3D1X(RenderTargetView) *target = get_active_render_target();
   if (target == gdraw->main_framebuffer) {
      gdraw->d3d_context->OMSetRenderTargets(1, &target, gdraw->depth_buffer[0]);
      gdraw->d3d_context->RSSetState(gdraw->raster_state[gdraw->main_msaa]);
   } else {
      ID3D1X(DepthStencilView) *depth = NULL;
      if (gdraw->cur->flags & (GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_id | GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_stencil))
         depth = get_rendertarget_depthbuffer(stats);

      gdraw->d3d_context->OMSetRenderTargets(1, &target, depth);
      gdraw->d3d_context->RSSetState(gdraw->raster_state[0]);
   }

   stats->nonzero_flags |= GDRAW_STATS_rendtarg;
   stats->rendertarget_changes += 1;
}

static rrbool RADLINK gdraw_TextureDrawBufferBegin(gswf_recti *region, gdraw_texture_format /*format*/, U32 flags, void *owner, GDrawStats *stats)
{
   GDrawFramebufferState *n = gdraw->cur+1;
   GDrawHandle *t = NULL;
   if (gdraw->tw == 0 || gdraw->th == 0) {
      IggyGDrawSendWarning(NULL, "GDraw warning: w=0,h=0 rendertarget");
      return false;
   }

   if (n >= &gdraw->frame[MAX_RENDER_STACK_DEPTH]) {
      assert(0);
      IggyGDrawSendWarning(NULL, "GDraw rendertarget nesting exceeds MAX_RENDER_STACK_DEPTH");
      return false;
   }

   if (owner) {
      // nyi
   } else {
      t = get_color_rendertarget(stats);
      if (!t)
         return false;
   }

   n->flags = flags;
   n->color_buffer = t;
   assert(n->color_buffer != NULL); // @GDRAW_ASSERT

   ++gdraw->cur;
   gdraw->cur->cached = owner != NULL;
   if (owner) {
      gdraw->cur->base_x = region->x0;
      gdraw->cur->base_y = region->y0;
      gdraw->cur->width = region->x1 - region->x0;
      gdraw->cur->height = region->y1 - region->y0;
   }

   set_render_target(stats);
   assert(gdraw->frametex_width >= gdraw->tw && gdraw->frametex_height >= gdraw->th); // @GDRAW_ASSERT

   S32 k = (S32) (t - gdraw->rendertargets.handle);

   if (region) {
      gswf_recti r;
      S32 ox, oy, pad = 2; // 2 pixels of border on all sides
      // 1 pixel turns out to be not quite enough with the interpolator precision we get.

      if (gdraw->in_blur)
         ox = oy = 0;
      else
         ox = gdraw->tx0p, oy = gdraw->ty0p;

      // clamp region to tile
      S32 xt0 = RR_MAX(region->x0 - ox, 0);
      S32 yt0 = RR_MAX(region->y0 - oy, 0);
      S32 xt1 = RR_MIN(region->x1 - ox, gdraw->tpw);
      S32 yt1 = RR_MIN(region->y1 - oy, gdraw->tph);

      // but the padding needs to clamp to render target bounds
      r.x0 = RR_MAX(xt0 - pad, 0);
      r.y0 = RR_MAX(yt0 - pad, 0);
      r.x1 = RR_MIN(xt1 + pad, gdraw->frametex_width);
      r.y1 = RR_MIN(yt1 + pad, gdraw->frametex_height);

      if (r.x1 <= r.x0 || r.y1 <= r.y0) { // region doesn't intersect with current tile
         --gdraw->cur;
         gdraw_FreeTexture((GDrawTexture *) t, 0, stats);
         // note: don't send a warning since this will happen during regular tiled rendering
         return false;
      }

      manual_clear(&r, stats);

      gdraw->rt_valid[k].x0 = xt0;
      gdraw->rt_valid[k].y0 = yt0;
      gdraw->rt_valid[k].x1 = xt1;
      gdraw->rt_valid[k].y1 = yt1;
   } else {
      gdraw->d3d_context->ClearRenderTargetView(gdraw->cur->color_buffer->handle.tex.d3d_rtview, four_zeros);
      gdraw->rt_valid[k].x0 = 0;
      gdraw->rt_valid[k].y0 = 0;
      gdraw->rt_valid[k].x1 = gdraw->frametex_width;
      gdraw->rt_valid[k].y1 = gdraw->frametex_height;
   }

   if (!gdraw->in_blur) {
      set_viewport();
      set_projection();
   } else {
      set_viewport_raw(0, 0, gdraw->tpw, gdraw->tph);
      set_projection_raw(0, gdraw->tpw, gdraw->tph, 0);
   }

   return true;
}

static GDrawTexture *RADLINK gdraw_TextureDrawBufferEnd(GDrawStats *stats)
{
   GDrawFramebufferState *n = gdraw->cur;
   GDrawFramebufferState *m = --gdraw->cur;
   if (gdraw->tw == 0 || gdraw->th == 0) return 0;

   if (n >= &gdraw->frame[MAX_RENDER_STACK_DEPTH])
      return 0; // already returned a warning in Begin

   assert(m >= gdraw->frame); // bug in Iggy -- unbalanced

   if (m != gdraw->frame) {
      assert(m->color_buffer != NULL); // @GDRAW_ASSERT
   }
   assert(n->color_buffer != NULL); // @GDRAW_ASSERT

   // switch back to old render target
   set_render_target(stats);

   // if we're at the root, set the viewport back
   set_viewport();
   set_projection();

   return (GDrawTexture *) n->color_buffer;
}


////////////////////////////////////////////////////////////////////////
//
// Clear stencil/depth buffers
//
// Open question whether we'd be better off finding bounding boxes
// and only clearing those; it depends exactly how fast clearing works.
//

static void RADLINK gdraw_ClearStencilBits(U32 /*bits*/)
{
   gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[0], D3D1X_(CLEAR_STENCIL), 1.0f, 0);
   if (gdraw->depth_buffer[1])
      gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_STENCIL), 1.0f, 0);
}

// this only happens rarely (hopefully never) if we use the depth buffer,
// so we can just clear the whole thing
static void RADLINK gdraw_ClearID(void)
{
   gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[0], D3D1X_(CLEAR_DEPTH), 1.0f, 0);
   if (gdraw->depth_buffer[1])
      gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_DEPTH), 1.0f, 0);
}

////////////////////////////////////////////////////////////////////////
//
// Set all the render state from GDrawRenderState
//
// This also is responsible for getting the framebuffer into a texture
// if the read-modify-write blend operation can't be expressed with
// the native blend operators. (E.g. "screen")
//

// convert an ID request to a value suitable for the depth buffer,
// assuming the depth buffer has been mapped to 0..1
static F32 depth_from_id(S32 id)
{
   return 1.0f - ((F32) id + 1.0f) / MAX_DEPTH_VALUE;
}

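/*
Illustrative sketch, not part of the GDraw source. It restates depth_from_id
above as a standalone function to show the range it produces: with
MAX_DEPTH_VALUE = 1 << 13 and max_id = MAX_DEPTH_VALUE - 2 (the values set in
gdraw_GetInfo), every valid ID maps strictly inside (0, 1) and larger IDs give
smaller depths. kMaxDepthValue and depth_from_id_sketch are hypothetical names.

```cpp
#include <cassert>

static const int kMaxDepthValue = 1 << 13; // mirrors MAX_DEPTH_VALUE

// Same formula as depth_from_id: ID 0 sits just below the far plane (1.0).
static float depth_from_id_sketch(int id)
{
   return 1.0f - ((float) id + 1.0f) / kMaxDepthValue;
}
```

Because the result never reaches 1.0, a depth buffer cleared to 1.0 stays
distinguishable from any drawn ID.
*/
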
static void set_texture(S32 texunit, GDrawTexture *tex, rrbool nearest, S32 wrap)
{
   ID3D1XContext *d3d = gdraw->d3d_context;

   if (tex == NULL) {
      ID3D1X(ShaderResourceView) *notex = NULL;
      d3d->PSSetShaderResources(texunit, 1, &notex);
   } else {
      GDrawHandle *h = (GDrawHandle *) tex;
      d3d->PSSetShaderResources(texunit, 1, &h->handle.tex.d3d_view);
      d3d->PSSetSamplers(texunit, 1, &gdraw->sampler_state[nearest][wrap]);
   }
}

static void RADLINK gdraw_Set3DTransform(F32 *mat)
{
   if (mat == NULL)
      gdraw->use_3d = 0;
   else {
      gdraw->use_3d = 1;
      memcpy(gdraw->xform_3d, mat, sizeof(gdraw->xform_3d));
   }
}

static int set_renderstate_full(S32 vertex_format, GDrawRenderState *r, GDrawStats * /* stats */, const F32 *rescale1)
{
   ID3D1XContext *d3d = gdraw->d3d_context;

   // set vertex shader
   set_vertex_shader(d3d, gdraw->vert[vertex_format].vshader);

   // set vertex shader constants
   if (VertexVars *vvars = (VertexVars *) map_buffer(gdraw->d3d_context, gdraw->cb_vertex, true)) {
      F32 depth = depth_from_id(r->id);
      if (!r->use_world_space)
         gdraw_ObjectSpace(vvars->world[0], r->o2w, depth, 0.0f);
      else
         gdraw_WorldSpace(vvars->world[0], gdraw->world_to_pixel, depth, 0.0f);

      memcpy(&vvars->x_off, r->edge_matrix, 4*sizeof(F32));

      if (r->texgen0_enabled) {
         memcpy(&vvars->texgen_s, r->s0_texgen, 4*sizeof(F32));
         memcpy(&vvars->texgen_t, r->t0_texgen, 4*sizeof(F32));
      }

      if (gdraw->use_3d)
         memcpy(vvars->x3d, gdraw->xform_3d, 12*sizeof(F32));
      else
         memcpy(vvars->x3d, gdraw->projmat, 12*sizeof(F32));

      unmap_buffer(gdraw->d3d_context, gdraw->cb_vertex);

      d3d->VSSetConstantBuffers(0, 1, &gdraw->cb_vertex);
   }

   // set the blend mode
   int blend_mode = r->blend_mode;
   if (blend_mode != gdraw->blend_mode) {
      gdraw->blend_mode = blend_mode;
      d3d->OMSetBlendState(gdraw->blend_state[blend_mode], four_zeros, ~0u);
   }

   // set the fragment program
   if (blend_mode != GDRAW_BLEND_special) {
      int which = r->tex0_mode;
      assert(which >= 0 && which < sizeof(gdraw->fprog) / sizeof(*gdraw->fprog));

      int additive = 0;
      if (r->cxf_add) {
         additive = 1;
         if (r->cxf_add[3]) additive = 2;
      }

      ID3D1X(PixelShader) *program = gdraw->fprog[which][additive].pshader;
      if (r->stencil_set) {
         // in stencil set mode, prefer not doing any shading at all
         // but if alpha test is on, we need to make an exception

#ifndef GDRAW_D3D11_LEVEL9 // level9 can't do NULL PS it seems
         if (which != GDRAW_TEXTURE_alpha_test)
            program = NULL;
         else
#endif
         {
            gdraw->blend_mode = -1;
            d3d->OMSetBlendState(gdraw->blend_no_color_write, four_zeros, ~0u);
         }
      }

      set_pixel_shader(d3d, program);
   } else
      set_pixel_shader(d3d, gdraw->exceptional_blend[r->special_blend].pshader);

   set_texture(0, r->tex[0], r->nearest0, r->wrap0);

   // pixel shader constants
   if (PixelCommonVars *pvars = (PixelCommonVars *) map_buffer(gdraw->d3d_context, gdraw->cb_ps_common, true)) {
      memcpy(pvars->color_mul, r->color, 4*sizeof(float));

      if (r->cxf_add) {
         pvars->color_add[0] = r->cxf_add[0] / 255.0f;
         pvars->color_add[1] = r->cxf_add[1] / 255.0f;
         pvars->color_add[2] = r->cxf_add[2] / 255.0f;
         pvars->color_add[3] = r->cxf_add[3] / 255.0f;
      } else
         pvars->color_add[0] = pvars->color_add[1] = pvars->color_add[2] = pvars->color_add[3] = 0.0f;

      if (r->tex0_mode == GDRAW_TEXTURE_focal_gradient) memcpy(pvars->focal, r->focal_point, 4*sizeof(float));
      if (r->blend_mode == GDRAW_BLEND_special) memcpy(pvars->rescale1, rescale1, 4*sizeof(float));
      unmap_buffer(gdraw->d3d_context, gdraw->cb_ps_common);
      d3d->PSSetConstantBuffers(0, 1, &gdraw->cb_ps_common);
   }

   // Set pixel operation states
   if (r->scissor) {
      D3D1X_(RECT) s;
      gdraw->scissor_state = 1;
      if (gdraw->cur == gdraw->frame) {
         s.left = r->scissor_rect.x0 + gdraw->vx - gdraw->tx0;
         s.top = r->scissor_rect.y0 + gdraw->vy - gdraw->ty0;
         s.right = r->scissor_rect.x1 + gdraw->vx - gdraw->tx0;
         s.bottom = r->scissor_rect.y1 + gdraw->vy - gdraw->ty0;
      } else {
         s.left = r->scissor_rect.x0 - gdraw->tx0p;
         s.top = r->scissor_rect.y0 - gdraw->ty0p;
         s.right = r->scissor_rect.x1 - gdraw->tx0p;
         s.bottom = r->scissor_rect.y1 - gdraw->ty0p;
      }
      d3d->RSSetScissorRects(1, &s);
   } else if (r->scissor != gdraw->scissor_state)
      disable_scissor(0);

   if (r->stencil_set | r->stencil_test)
      d3d->OMSetDepthStencilState(stencil_state_cache_lookup(r->set_id, r->test_id, r->stencil_test, r->stencil_set), 255);
   else
      d3d->OMSetDepthStencilState(gdraw->depth_state[r->set_id][r->test_id], 0);

   return 1;
}

static RADINLINE int set_renderstate(S32 vertex_format, GDrawRenderState *r, GDrawStats *stats)
{
   static const F32 unit_rescale[4] = { 1.0f, 1.0f, 0.0f, 0.0f };
   if (r->identical_state) {
      // fast path: only need to change vertex shader, other state is the same
      set_vertex_shader(gdraw->d3d_context, gdraw->vert[vertex_format].vshader);
      return 1;
   } else
      return set_renderstate_full(vertex_format, r, stats, unit_rescale);
}

////////////////////////////////////////////////////////////////////////
//
// Vertex formats
//

static D3D1X_(INPUT_ELEMENT_DESC) vformat_v2[] = {
   { "POSITION", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 0, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
};

static D3D1X_(INPUT_ELEMENT_DESC) vformat_v2aa[] = {
   { "POSITION", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 0, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
   { "TEXCOORD", 0, DXGI_FORMAT_R16G16B16A16_SINT, 0, 8, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
};

static D3D1X_(INPUT_ELEMENT_DESC) vformat_v2tc2[] = {
   { "POSITION", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 0, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
   { "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 8, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
};

static struct gdraw_vertex_format_desc {
   D3D1X_(INPUT_ELEMENT_DESC) *desc;
   U32 nelem;
} vformats[ASSERT_COUNT(GDRAW_vformat__basic_count, 3)] = {
   vformat_v2, 1, // GDRAW_vformat_v2
   vformat_v2aa, 2, // GDRAW_vformat_v2aa
   vformat_v2tc2, 2, // GDRAW_vformat_v2tc2
};

static int vertsize[GDRAW_vformat__basic_count] = {
   8, // GDRAW_vformat_v2
   16, // GDRAW_vformat_v2aa
   16, // GDRAW_vformat_v2tc2
};

////////////////////////////////////////////////////////////////////////
//
// Draw triangles with a given renderstate
//

static void tag_resources(void *r1, void *r2=NULL, void *r3=NULL, void *r4=NULL)
{
   U64 now = gdraw->frame_counter;
   if (r1) ((GDrawHandle *) r1)->fence.value = now;
   if (r2) ((GDrawHandle *) r2)->fence.value = now;
   if (r3) ((GDrawHandle *) r3)->fence.value = now;
   if (r4) ((GDrawHandle *) r4)->fence.value = now;
}

static void RADLINK gdraw_DrawIndexedTriangles(GDrawRenderState *r, GDrawPrimitive *p, GDrawVertexBuffer *buf, GDrawStats *stats)
|
|
{
|
|
ID3D1XContext *d3d = gdraw->d3d_context;
|
|
GDrawHandle *vb = (GDrawHandle *) buf;
|
|
int vfmt = p->vertex_format;
|
|
assert(vfmt >= 0 && vfmt < GDRAW_vformat__count);
|
|
|
|
if (!set_renderstate(vfmt, r, stats))
|
|
return;
|
|
|
|
UINT stride = vertsize[vfmt];
|
|
d3d->IASetInputLayout(gdraw->inlayout[vfmt]);
|
|
|
|
if (vb) {
|
|
UINT offs = (UINT) (UINTa) p->vertices;
|
|
|
|
d3d->IASetVertexBuffers(0, 1, &vb->handle.vbuf.verts, &stride, &offs);
|
|
d3d->IASetIndexBuffer(vb->handle.vbuf.inds, DXGI_FORMAT_R16_UINT, (UINT) (UINTa) p->indices);
|
|
d3d->DrawIndexed(p->num_indices, 0, 0);
|
|
} else if (p->indices) {
|
|
U32 vbytes = p->num_vertices * stride;
|
|
U32 ibytes = p->num_indices * 2;
|
|
|
|
if (void *vbptr = start_write_dyn(&gdraw->dyn_vb, vbytes)) {
|
|
memcpy(vbptr, p->vertices, vbytes);
|
|
UINT vboffs = end_write_dyn(&gdraw->dyn_vb);
|
|
|
|
if (void *ibptr = start_write_dyn(&gdraw->dyn_ib, ibytes)) {
|
|
memcpy(ibptr, p->indices, ibytes);
|
|
UINT iboffs = end_write_dyn(&gdraw->dyn_ib);
|
|
|
|
d3d->IASetVertexBuffers(0, 1, &gdraw->dyn_vb.buffer, &stride, &vboffs);
|
|
d3d->IASetIndexBuffer(gdraw->dyn_ib.buffer, DXGI_FORMAT_R16_UINT, iboffs);
|
|
d3d->DrawIndexed(p->num_indices, 0, 0);
|
|
}
|
|
}
|
|
} else { // dynamic quads
|
|
assert(p->num_vertices % 4 == 0);
|
|
d3d->IASetIndexBuffer(gdraw->quad_ib, DXGI_FORMAT_R16_UINT, 0);
|
|
|
|
if (gdraw->max_quad_vert_count) {
|
|
S32 pos = 0;
|
|
while (pos < p->num_vertices) {
|
|
S32 vert_count = RR_MIN(p->num_vertices - pos, gdraw->max_quad_vert_count);
|
|
UINT chunk_bytes = vert_count * stride;
|
|
|
|
if (void *vbptr = start_write_dyn(&gdraw->dyn_vb, chunk_bytes)) {
|
|
memcpy(vbptr, (U8 *)p->vertices + pos*stride, chunk_bytes);
|
|
UINT offs = end_write_dyn(&gdraw->dyn_vb);
|
|
|
|
d3d->IASetVertexBuffers(0, 1, &gdraw->dyn_vb.buffer, &stride, &offs);
|
|
d3d->DrawIndexed((vert_count >> 2) * 6, 0, 0);
|
|
}
|
|
pos += vert_count;
|
|
}
|
|
}
|
|
}
|
|
|
|
tag_resources(vb, r->tex[0], r->tex[1]);
|
|
|
|
stats->nonzero_flags |= GDRAW_STATS_batches;
|
|
stats->num_batches += 1;
|
|
stats->drawn_indices += p->num_indices;
|
|
stats->drawn_vertices += p->num_vertices;
|
|
}
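
// Illustrative sketch, not part of GDraw: the "dynamic quads" branch above
// splits a quad list into chunks of at most max_quad_vert_count vertices and
// draws (vert_count / 4) * 6 indices per chunk from the shared quad index
// buffer. The helper below restates that arithmetic in isolation. The
// namespace, struct, and function names are hypothetical and exist only for
// this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

namespace gdraw_quad_chunk_example { // hypothetical, for illustration only

struct Chunk {
   std::int32_t first_vertex;  // offset of the chunk within the quad list
   std::int32_t vertex_count;  // vertices copied into the dynamic VB
   std::int32_t index_count;   // indices drawn: 6 per quad (two triangles)
};

// Plan the draw calls for a quad list, assuming the same rules as the
// source: the vertex count is a multiple of 4, and nothing is drawn when
// the per-draw vertex cap is zero.
inline std::vector<Chunk> plan_quad_chunks(std::int32_t num_vertices,
                                           std::int32_t max_quad_vert_count)
{
   assert(num_vertices >= 0 && num_vertices % 4 == 0);
   assert(max_quad_vert_count >= 0 && max_quad_vert_count % 4 == 0);

   std::vector<Chunk> chunks;
   if (max_quad_vert_count == 0)
      return chunks;

   std::int32_t pos = 0;
   while (pos < num_vertices) {
      std::int32_t count = std::min(num_vertices - pos, max_quad_vert_count);
      chunks.push_back(Chunk{ pos, count, (count >> 2) * 6 });
      pos += count;
   }
   return chunks;
}

} // namespace gdraw_quad_chunk_example
```

// For example, 20 vertices with a cap of 8 vertices per draw yield three
// chunks of 8, 8, and 4 vertices, drawing 12, 12, and 6 indices.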
|
|
|
|
///////////////////////////////////////////////////////////////////////
|
|
//
|
|
// Flash 8 filter effects
|
|
//
|
|
|
|
static void *start_ps_constants(ID3D1X(Buffer) *buffer)
{
   return map_buffer(gdraw->d3d_context, buffer, true);
}

static void end_ps_constants(ID3D1X(Buffer) *buffer)
{
   unmap_buffer(gdraw->d3d_context, buffer);
   gdraw->d3d_context->PSSetConstantBuffers(1, 1, &buffer);
}

static void set_pixel_constant(F32 *constant, F32 x, F32 y, F32 z, F32 w)
{
   constant[0] = x;
   constant[1] = y;
   constant[2] = z;
   constant[3] = w;
}
|
|
|
|
// caller sets up texture coordinates
|
|
static void do_screen_quad(gswf_recti *s, const F32 *tc, GDrawStats *stats)
|
|
{
|
|
ID3D1XContext *d3d = gdraw->d3d_context;
|
|
F32 px0 = (F32) s->x0, py0 = (F32) s->y0, px1 = (F32) s->x1, py1 = (F32) s->y1;
|
|
|
|
// generate vertex data
|
|
gswf_vertex_xyst *vert = (gswf_vertex_xyst *) start_write_dyn(&gdraw->dyn_vb, 4 * sizeof(gswf_vertex_xyst));
|
|
if (!vert)
|
|
return;
|
|
|
|
vert[0].x = px0; vert[0].y = py0; vert[0].s = tc[0]; vert[0].t = tc[1];
|
|
vert[1].x = px1; vert[1].y = py0; vert[1].s = tc[2]; vert[1].t = tc[1];
|
|
vert[2].x = px0; vert[2].y = py1; vert[2].s = tc[0]; vert[2].t = tc[3];
|
|
vert[3].x = px1; vert[3].y = py1; vert[3].s = tc[2]; vert[3].t = tc[3];
|
|
UINT offs = end_write_dyn(&gdraw->dyn_vb);
|
|
UINT stride = sizeof(gswf_vertex_xyst);
|
|
|
|
if (VertexVars *vvars = (VertexVars *) map_buffer(gdraw->d3d_context, gdraw->cb_vertex, true)) {
|
|
gdraw_PixelSpace(vvars->world[0]);
|
|
memcpy(vvars->x3d, gdraw->projmat, 12*sizeof(F32));
|
|
unmap_buffer(gdraw->d3d_context, gdraw->cb_vertex);
|
|
d3d->VSSetConstantBuffers(0, 1, &gdraw->cb_vertex);
|
|
|
|
set_vertex_shader(d3d, gdraw->vert[GDRAW_vformat_v2tc2].vshader);
|
|
|
|
d3d->IASetInputLayout(gdraw->inlayout[GDRAW_vformat_v2tc2]);
|
|
d3d->IASetVertexBuffers(0, 1, &gdraw->dyn_vb.buffer, &stride, &offs);
|
|
d3d->IASetPrimitiveTopology(D3D1X_(PRIMITIVE_TOPOLOGY_TRIANGLESTRIP));
|
|
d3d->Draw(4, 0);
|
|
d3d->IASetPrimitiveTopology(D3D1X_(PRIMITIVE_TOPOLOGY_TRIANGLELIST));
|
|
|
|
stats->nonzero_flags |= GDRAW_STATS_batches;
|
|
stats->num_batches += 1;
|
|
stats->drawn_indices += 6;
|
|
stats->drawn_vertices += 4;
|
|
}
|
|
}
|
|
|
|
static void manual_clear(gswf_recti *r, GDrawStats *stats)
|
|
{
|
|
ID3D1XContext *d3d = gdraw->d3d_context;
|
|
|
|
// go to known render state
|
|
d3d->OMSetBlendState(gdraw->blend_state[GDRAW_BLEND_none], four_zeros, ~0u);
|
|
d3d->OMSetDepthStencilState(gdraw->depth_state[0][0], 0);
|
|
gdraw->blend_mode = GDRAW_BLEND_none;
|
|
|
|
set_viewport_raw(0, 0, gdraw->frametex_width, gdraw->frametex_height);
|
|
set_projection_raw(0, gdraw->frametex_width, gdraw->frametex_height, 0);
|
|
set_pixel_shader(d3d, gdraw->clear_ps.pshader);
|
|
|
|
if (PixelCommonVars *pvars = (PixelCommonVars *) map_buffer(gdraw->d3d_context, gdraw->cb_ps_common, true)) {
|
|
memset(pvars, 0, sizeof(*pvars));
|
|
unmap_buffer(gdraw->d3d_context, gdraw->cb_ps_common);
|
|
d3d->PSSetConstantBuffers(0, 1, &gdraw->cb_ps_common);
|
|
|
|
do_screen_quad(r, four_zeros, stats);
|
|
}
|
|
}
|
|
|
|
static void gdraw_DriverBlurPass(GDrawRenderState *r, int taps, float *data, gswf_recti *s, float *tc, float /*height_max*/, float *clamp, GDrawStats *gstats)
|
|
{
|
|
set_texture(0, r->tex[0], false, GDRAW_WRAP_clamp);
|
|
|
|
set_pixel_shader(gdraw->d3d_context, gdraw->blur_prog[taps].pshader);
|
|
PixelParaBlur *para = (PixelParaBlur *) start_ps_constants(gdraw->cb_blur);
|
|
memcpy(para->clamp, clamp, 4 * sizeof(float));
|
|
memcpy(para->tap, data, taps * 4 * sizeof(float));
|
|
end_ps_constants(gdraw->cb_blur);
|
|
|
|
do_screen_quad(s, tc, gstats);
|
|
tag_resources(r->tex[0]);
|
|
}
|
|
|
|
static void gdraw_Colormatrix(GDrawRenderState *r, gswf_recti *s, float *tc, GDrawStats *stats)
|
|
{
|
|
if (!gdraw_TextureDrawBufferBegin(s, GDRAW_TEXTURE_FORMAT_rgba32, GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_color | GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_alpha, 0, stats))
|
|
return;
|
|
|
|
set_texture(0, r->tex[0], false, GDRAW_WRAP_clamp);
|
|
set_pixel_shader(gdraw->d3d_context, gdraw->colormatrix.pshader);
|
|
|
|
PixelParaColorMatrix *para = (PixelParaColorMatrix *) start_ps_constants(gdraw->cb_colormatrix);
|
|
memcpy(para->data, r->shader_data, 5 * 4 * sizeof(float));
|
|
end_ps_constants(gdraw->cb_colormatrix);
|
|
|
|
do_screen_quad(s, tc, stats);
|
|
tag_resources(r->tex[0]);
|
|
r->tex[0] = gdraw_TextureDrawBufferEnd(stats);
|
|
}
|
|
|
|
static gswf_recti *get_valid_rect(GDrawTexture *tex)
|
|
{
|
|
GDrawHandle *h = (GDrawHandle *) tex;
|
|
S32 n = (S32) (h - gdraw->rendertargets.handle);
|
|
assert(n >= 0 && n <= MAX_RENDER_STACK_DEPTH+1);
|
|
return &gdraw->rt_valid[n];
|
|
}
|
|
|
|
static void set_clamp_constant(F32 *constant, GDrawTexture *tex)
|
|
{
|
|
gswf_recti *s = get_valid_rect(tex);
|
|
// when we make the valid data, we make sure there is an extra empty pixel at the border
|
|
set_pixel_constant(constant,
|
|
(s->x0-0.5f) / gdraw->frametex_width,
|
|
(s->y0-0.5f) / gdraw->frametex_height,
|
|
(s->x1+0.5f) / gdraw->frametex_width,
|
|
(s->y1+0.5f) / gdraw->frametex_height);
|
|
}
|
|
|
|
static void gdraw_Filter(GDrawRenderState *r, gswf_recti *s, float *tc, int isbevel, GDrawStats *stats)
|
|
{
|
|
if (!gdraw_TextureDrawBufferBegin(s, GDRAW_TEXTURE_FORMAT_rgba32, GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_color | GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_alpha, NULL, stats))
|
|
return;
|
|
|
|
set_texture(0, r->tex[0], false, GDRAW_WRAP_clamp);
|
|
set_texture(1, r->tex[1], false, GDRAW_WRAP_clamp);
|
|
set_texture(2, r->tex[2], false, GDRAW_WRAP_clamp);
|
|
set_pixel_shader(gdraw->d3d_context, gdraw->filter_prog[isbevel][r->filter_mode].pshader);
|
|
|
|
PixelParaFilter *para = (PixelParaFilter *) start_ps_constants(gdraw->cb_filter);
|
|
set_clamp_constant(para->clamp0, r->tex[0]);
|
|
set_clamp_constant(para->clamp1, r->tex[1]);
|
|
set_pixel_constant(para->color, r->shader_data[0], r->shader_data[1], r->shader_data[2], r->shader_data[3]);
|
|
set_pixel_constant(para->color2, r->shader_data[8], r->shader_data[9], r->shader_data[10], r->shader_data[11]);
|
|
set_pixel_constant(para->tc_off, -r->shader_data[4] / (F32)gdraw->frametex_width, -r->shader_data[5] / (F32)gdraw->frametex_height, r->shader_data[6], 0);
|
|
end_ps_constants(gdraw->cb_filter);
|
|
|
|
do_screen_quad(s, tc, stats);
|
|
tag_resources(r->tex[0], r->tex[1], r->tex[2]);
|
|
r->tex[0] = gdraw_TextureDrawBufferEnd(stats);
|
|
}
|
|
|
|
static void RADLINK gdraw_FilterQuad(GDrawRenderState *r, S32 x0, S32 y0, S32 x1, S32 y1, GDrawStats *stats)
|
|
{
|
|
ID3D1XContext *d3d = gdraw->d3d_context;
|
|
F32 tc[4];
|
|
gswf_recti s;
|
|
|
|
// clip to tile boundaries
|
|
s.x0 = RR_MAX(x0, gdraw->tx0p);
|
|
s.y0 = RR_MAX(y0, gdraw->ty0p);
|
|
s.x1 = RR_MIN(x1, gdraw->tx0p + gdraw->tpw);
|
|
s.y1 = RR_MIN(y1, gdraw->ty0p + gdraw->tph);
|
|
if (s.x1 < s.x0 || s.y1 < s.y0)
|
|
return;
|
|
|
|
tc[0] = (s.x0 - gdraw->tx0p) / (F32) gdraw->frametex_width;
|
|
tc[1] = (s.y0 - gdraw->ty0p) / (F32) gdraw->frametex_height;
|
|
tc[2] = (s.x1 - gdraw->tx0p) / (F32) gdraw->frametex_width;
|
|
tc[3] = (s.y1 - gdraw->ty0p) / (F32) gdraw->frametex_height;
|
|
|
|
// clear to known render state
|
|
d3d->OMSetBlendState(gdraw->blend_state[GDRAW_BLEND_none], four_zeros, ~0u);
|
|
d3d->OMSetDepthStencilState(gdraw->depth_state[0][0], 0);
|
|
disable_scissor(0);
|
|
gdraw->blend_mode = GDRAW_BLEND_none;
|
|
|
|
if (r->blend_mode == GDRAW_BLEND_filter) {
|
|
switch (r->filter) {
|
|
case GDRAW_FILTER_blur: {
|
|
GDrawBlurInfo b;
|
|
gswf_recti bounds = *get_valid_rect(r->tex[0]);
|
|
gdraw_ShiftRect(&s, &s, -gdraw->tx0p, -gdraw->ty0p); // blur uses physical rendertarget coordinates
|
|
|
|
b.BlurPass = gdraw_DriverBlurPass;
|
|
b.w = gdraw->tpw;
|
|
b.h = gdraw->tph;
|
|
b.frametex_width = gdraw->frametex_width;
|
|
b.frametex_height = gdraw->frametex_height;
|
|
|
|
// blur needs to draw with multiple passes, so set up special state
|
|
gdraw->in_blur = true;
|
|
|
|
// do the blur
|
|
gdraw_Blur(&gdraw_funcs, &b, r, &s, &bounds, stats);
|
|
|
|
// restore the normal state
|
|
gdraw->in_blur = false;
|
|
set_viewport();
|
|
set_projection();
|
|
break;
|
|
}
|
|
|
|
case GDRAW_FILTER_colormatrix:
|
|
gdraw_Colormatrix(r, &s, tc, stats);
|
|
break;
|
|
|
|
case GDRAW_FILTER_dropshadow:
|
|
gdraw_Filter(r, &s, tc, 0, stats);
|
|
break;
|
|
|
|
case GDRAW_FILTER_bevel:
|
|
gdraw_Filter(r, &s, tc, 1, stats);
|
|
break;
|
|
|
|
default:
|
|
assert(0);
|
|
}
|
|
} else {
|
|
GDrawHandle *blend_tex = NULL;
|
|
|
|
// for crazy blend modes, we need to read back from the framebuffer
|
|
// and do the blending in the pixel shader. we do this with copies
|
|
// rather than trying to render-to-texture-all-along, because we want
|
|
// to be able to render over the user's existing framebuffer, which might
|
|
// not be a texture. note that this isn't optimal when MSAA is on!
|
|
F32 rescale1[4] = { 1.0f, 1.0f, 0.0f, 0.0f };
|
|
if (r->blend_mode == GDRAW_BLEND_special) {
|
|
ID3D1X(Resource) *cur_rt_rsrc;
|
|
get_active_render_target()->GetResource(&cur_rt_rsrc);
|
|
|
|
if (gdraw->cur == gdraw->frame && gdraw->main_msaa) {
|
|
// source surface is main framebuffer and it uses MSAA. just resolve it first.
|
|
D3D1X_(SHADER_RESOURCE_VIEW_DESC) desc;
|
|
D3D1X_(TEXTURE2D_DESC) texdesc;
|
|
ID3D1X(Texture2D) *resolve_tex;
|
|
|
|
gdraw->main_resolve_target->GetDesc(&desc);
|
|
gdraw->main_resolve_target->GetResource((ID3D1X(Resource) **) &resolve_tex);
|
|
resolve_tex->GetDesc(&texdesc);
|
|
d3d->ResolveSubresource(resolve_tex, 0, cur_rt_rsrc, 0, desc.Format);
|
|
resolve_tex->Release();
|
|
|
|
stats->nonzero_flags |= GDRAW_STATS_blits;
|
|
stats->num_blits += 1;
|
|
stats->num_blit_pixels += texdesc.Width * texdesc.Height;
|
|
|
|
d3d->PSSetShaderResources(1, 1, &gdraw->main_resolve_target);
|
|
d3d->PSSetSamplers(1, 1, &gdraw->sampler_state[0][GDRAW_WRAP_clamp]);
|
|
|
|
// calculate texture coordinate remapping
|
|
rescale1[0] = gdraw->frametex_width / (F32) texdesc.Width;
|
|
rescale1[1] = gdraw->frametex_height / (F32) texdesc.Height;
|
|
rescale1[2] = (gdraw->vx - gdraw->tx0 + gdraw->tx0p) / (F32) texdesc.Width;
|
|
rescale1[3] = (gdraw->vy - gdraw->ty0 + gdraw->ty0p) / (F32) texdesc.Height;
|
|
} else {
|
|
D3D1X_(BOX) box = { 0,0,0,0,0,1 };
|
|
S32 dx = 0, dy = 0;
|
|
blend_tex = get_color_rendertarget(stats);
|
|
|
|
if (gdraw->cur != gdraw->frame)
|
|
box.right=gdraw->tpw, box.bottom=gdraw->tph;
|
|
else {
|
|
box.left=gdraw->vx, box.top=gdraw->vy, box.right=gdraw->vx+gdraw->tw, box.bottom=gdraw->vy+gdraw->th;
|
|
dx = gdraw->tx0 - gdraw->tx0p;
|
|
dy = gdraw->ty0 - gdraw->ty0p;
|
|
}
|
|
|
|
d3d->CopySubresourceRegion(blend_tex->handle.tex.d3d, 0, dx, dy, 0,
|
|
cur_rt_rsrc, 0, &box);
|
|
|
|
stats->nonzero_flags |= GDRAW_STATS_blits;
|
|
stats->num_blits += 1;
|
|
stats->num_blit_pixels += (box.right - box.left) * (box.bottom - box.top);
|
|
|
|
set_texture(1, (GDrawTexture *) blend_tex, false, GDRAW_WRAP_clamp);
|
|
}
|
|
|
|
cur_rt_rsrc->Release();
|
|
}
|
|
|
|
if (!set_renderstate_full(GDRAW_vformat_v2tc2, r, stats, rescale1))
|
|
return;
|
|
|
|
do_screen_quad(&s, tc, stats);
|
|
tag_resources(r->tex[0], r->tex[1]);
|
|
if (blend_tex)
|
|
gdraw_FreeTexture((GDrawTexture *) blend_tex, 0, stats);
|
|
}
|
|
}
|
|
|
|
///////////////////////////////////////////////////////////////////////
|
|
//
|
|
// Shaders and state
|
|
//
|
|
|
|
#include GDRAW_SHADER_FILE
|
|
|
|
static void destroy_shader(ProgramWithCachedVariableLocations *p)
{
   if (p->pshader) {
      p->pshader->Release();
      p->pshader = NULL;
   }
}
|
|
|
|
static ID3D1X(Buffer) *create_dynamic_buffer(U32 size, U32 bind)
|
|
{
|
|
D3D1X_(BUFFER_DESC) desc = { size, D3D1X_(USAGE_DYNAMIC), bind, D3D1X_(CPU_ACCESS_WRITE), 0 };
|
|
ID3D1X(Buffer) *buf = NULL;
|
|
HRESULT hr = gdraw->d3d_device->CreateBuffer(&desc, NULL, &buf);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateBuffer", " creating dynamic buffer");
|
|
buf = NULL;
|
|
}
|
|
return buf;
|
|
}
|
|
|
|
static void init_dyn_buffer(DynBuffer *buf, U32 size, U32 bind)
{
   buf->buffer = create_dynamic_buffer(size, bind);
   buf->size = size;
   buf->write_pos = 0;
   buf->alloc_pos = 0;
}
|
|
|
|
// These two functions are implemented by the D3D10- and D3D11-specific parts, respectively.
|
|
static void create_pixel_shader(ProgramWithCachedVariableLocations *p, ProgramWithCachedVariableLocations *src);
|
|
static void create_vertex_shader(ProgramWithCachedVariableLocations *p, ProgramWithCachedVariableLocations *src);
|
|
|
|
static void create_all_shaders_and_state(void)
|
|
{
|
|
ID3D1X(Device) *d3d = gdraw->d3d_device;
|
|
HRESULT hr;
|
|
S32 i, j;
|
|
|
|
for (i=0; i < GDRAW_TEXTURE__count*3; ++i) create_pixel_shader(&gdraw->fprog[0][i], pshader_basic_arr + i);
|
|
for (i=0; i < GDRAW_BLENDSPECIAL__count; ++i) create_pixel_shader(&gdraw->exceptional_blend[i], pshader_exceptional_blend_arr + i);
|
|
for (i=0; i < 32; ++i) create_pixel_shader(&gdraw->filter_prog[0][i], pshader_filter_arr + i);
|
|
for (i=0; i < MAX_TAPS+1; ++i) create_pixel_shader(&gdraw->blur_prog[i], pshader_blur_arr + i);
|
|
create_pixel_shader(&gdraw->colormatrix, pshader_color_matrix_arr);
|
|
create_pixel_shader(&gdraw->clear_ps, pshader_manual_clear_arr);
|
|
|
|
for (i=0; i < GDRAW_vformat__basic_count; i++) {
|
|
ProgramWithCachedVariableLocations *vsh = vshader_vsd3d10_arr + i;
|
|
|
|
create_vertex_shader(&gdraw->vert[i], vsh);
|
|
hr = d3d->CreateInputLayout(vformats[i].desc, vformats[i].nelem, vsh->bytecode, vsh->size, &gdraw->inlayout[i]);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateInputLayout", "");
|
|
gdraw->inlayout[i] = NULL;
|
|
}
|
|
}
|
|
|
|
// create rasterizer state setups
|
|
for (i=0; i < 2; ++i) {
|
|
D3D1X_(RASTERIZER_DESC) raster_desc = { D3D1X_(FILL_SOLID), D3D1X_(CULL_NONE), FALSE, 0, 0.0f, 0.0f, TRUE, TRUE, FALSE, FALSE };
|
|
raster_desc.MultisampleEnable = i;
|
|
hr = d3d->CreateRasterizerState(&raster_desc, &gdraw->raster_state[i]);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateRasterizerState", "");
|
|
return;
|
|
}
|
|
}
|
|
|
|
// create sampler state setups
|
|
static const D3D1X_(TEXTURE_ADDRESS_MODE) addrmode[ASSERT_COUNT(GDRAW_WRAP__count, 4)] = {
|
|
D3D1X_(TEXTURE_ADDRESS_CLAMP), // GDRAW_WRAP_clamp
|
|
D3D1X_(TEXTURE_ADDRESS_WRAP), // GDRAW_WRAP_repeat
|
|
D3D1X_(TEXTURE_ADDRESS_MIRROR), // GDRAW_WRAP_mirror
|
|
D3D1X_(TEXTURE_ADDRESS_CLAMP), // GDRAW_WRAP_clamp_to_border (unused for this renderer)
|
|
};
|
|
|
|
for (i=0; i < 2; ++i) {
|
|
for (j=0; j < GDRAW_WRAP__count; ++j) {
|
|
D3D1X_(SAMPLER_DESC) sampler_desc;
|
|
memset(&sampler_desc, 0, sizeof(sampler_desc));
|
|
sampler_desc.Filter = i ? D3D1X_(FILTER_MIN_LINEAR_MAG_MIP_POINT) : D3D1X_(FILTER_MIN_MAG_MIP_LINEAR);
|
|
sampler_desc.AddressU = addrmode[j];
|
|
sampler_desc.AddressV = addrmode[j];
|
|
sampler_desc.AddressW = D3D1X_(TEXTURE_ADDRESS_CLAMP);
|
|
sampler_desc.MaxAnisotropy = 1;
|
|
sampler_desc.MaxLOD = D3D1X_(FLOAT32_MAX);
|
|
hr = d3d->CreateSamplerState(&sampler_desc, &gdraw->sampler_state[i][j]);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateSamplerState", "");
|
|
return;
|
|
}
|
|
}
|
|
}
|
|
|
|
// create blend stage setups
|
|
static struct blendspec {
|
|
BOOL blend;
|
|
D3D1X_(BLEND) src;
|
|
D3D1X_(BLEND) dst;
|
|
} blends[ASSERT_COUNT(GDRAW_BLEND__count, 6)] = {
|
|
FALSE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ZERO), // GDRAW_BLEND_none
|
|
TRUE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_INV_SRC_ALPHA), // GDRAW_BLEND_alpha
|
|
TRUE, D3D1X_(BLEND_DEST_COLOR), D3D1X_(BLEND_INV_SRC_ALPHA), // GDRAW_BLEND_multiply
|
|
TRUE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ONE), // GDRAW_BLEND_add
|
|
|
|
FALSE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ZERO), // GDRAW_BLEND_filter
|
|
FALSE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ZERO), // GDRAW_BLEND_special
|
|
};
|
|
|
|
for (i=0; i < GDRAW_BLEND__count; ++i) {
|
|
gdraw->blend_state[i] = create_blend_state(d3d, blends[i].blend, blends[i].src, blends[i].dst);
|
|
if (!gdraw->blend_state[i])
|
|
return;
|
|
}
|
|
|
|
D3D1X_(BLEND_DESC) blend_desc;
|
|
memset(&blend_desc, 0, sizeof(blend_desc));
|
|
hr = d3d->CreateBlendState(&blend_desc, &gdraw->blend_no_color_write);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateBlendState", "");
|
|
return;
|
|
}
|
|
|
|
// create depth/stencil setups
|
|
for (i=0; i < 2; ++i) {
|
|
for (j=0; j < 2; ++j) {
|
|
D3D1X_(DEPTH_STENCIL_DESC) depth_desc;
|
|
memset(&depth_desc, 0, sizeof(depth_desc));
|
|
|
|
depth_desc.DepthEnable = (i || j);
|
|
depth_desc.DepthWriteMask = i ? D3D1X_(DEPTH_WRITE_MASK_ALL) : D3D1X_(DEPTH_WRITE_MASK_ZERO);
|
|
depth_desc.DepthFunc = j ? D3D1X_(COMPARISON_LESS) : D3D1X_(COMPARISON_ALWAYS);
|
|
depth_desc.StencilEnable = FALSE;
|
|
|
|
hr = d3d->CreateDepthStencilState(&depth_desc, &gdraw->depth_state[i][j]);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateDepthStencilState", "");
|
|
return;
|
|
}
|
|
}
|
|
}
|
|
|
|
// constant buffers
|
|
gdraw->cb_vertex = create_dynamic_buffer(sizeof(VertexVars), D3D1X_(BIND_CONSTANT_BUFFER));
|
|
gdraw->cb_ps_common = create_dynamic_buffer(sizeof(PixelCommonVars), D3D1X_(BIND_CONSTANT_BUFFER));
|
|
gdraw->cb_filter = create_dynamic_buffer(sizeof(PixelParaFilter), D3D1X_(BIND_CONSTANT_BUFFER));
|
|
gdraw->cb_colormatrix = create_dynamic_buffer(sizeof(PixelParaColorMatrix), D3D1X_(BIND_CONSTANT_BUFFER));
|
|
gdraw->cb_blur = create_dynamic_buffer(sizeof(PixelParaBlur), D3D1X_(BIND_CONSTANT_BUFFER));
|
|
|
|
// quad index buffer
|
|
assert(QUAD_IB_COUNT * 4 < 65535); // can't use more; we have 16-bit index buffers and 0xffff = primitive cut index
|
|
U16 *inds = (U16 *) IggyGDrawMalloc(QUAD_IB_COUNT * 6 * sizeof(U16));
|
|
if (inds) {
|
|
D3D1X_(BUFFER_DESC) bufdesc = { };
|
|
D3D1X_(SUBRESOURCE_DATA) data = { inds, 0, 0 };
|
|
|
|
bufdesc.ByteWidth = QUAD_IB_COUNT * 6 * sizeof(U16);
|
|
bufdesc.Usage = D3D1X_(USAGE_IMMUTABLE);
|
|
bufdesc.BindFlags = D3D1X_(BIND_INDEX_BUFFER);
|
|
|
|
for (U16 i=0; i < QUAD_IB_COUNT; i++) {
|
|
inds[i*6 + 0] = i*4 + 0;
|
|
inds[i*6 + 1] = i*4 + 1;
|
|
inds[i*6 + 2] = i*4 + 2;
|
|
inds[i*6 + 3] = i*4 + 0;
|
|
inds[i*6 + 4] = i*4 + 2;
|
|
inds[i*6 + 5] = i*4 + 3;
|
|
}
|
|
|
|
hr = gdraw->d3d_device->CreateBuffer(&bufdesc, &data, &gdraw->quad_ib);
|
|
if (FAILED(hr)) {
|
|
report_d3d_error(hr, "CreateBuffer", " for quad index buffer");
|
|
gdraw->quad_ib = NULL;
|
|
}
|
|
IggyGDrawFree(inds);
|
|
} else
|
|
gdraw->quad_ib = NULL;
|
|
}
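
// Illustrative sketch, not part of GDraw: the quad index buffer built at the
// end of create_all_shaders_and_state() turns every group of four vertices
// into two triangles, (0,1,2) and (0,2,3), relative to the quad's first
// vertex. The standalone helper below reproduces that pattern so it can be
// inspected without a D3D device. The namespace and function name are
// hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace gdraw_quad_ib_example { // hypothetical, for illustration only

// Build 16-bit indices for quad_count quads, six indices per quad.
inline std::vector<std::uint16_t> make_quad_indices(std::uint32_t quad_count)
{
   // Same constraint as the source: indices must fit in 16 bits, and
   // 0xffff is reserved as the primitive cut index.
   assert(quad_count * 4u < 65535u);

   std::vector<std::uint16_t> inds(static_cast<std::size_t>(quad_count) * 6);
   for (std::uint32_t q = 0; q < quad_count; ++q) {
      const std::uint32_t base = q * 4;
      const std::size_t   o    = static_cast<std::size_t>(q) * 6;
      inds[o + 0] = static_cast<std::uint16_t>(base + 0);
      inds[o + 1] = static_cast<std::uint16_t>(base + 1);
      inds[o + 2] = static_cast<std::uint16_t>(base + 2);
      inds[o + 3] = static_cast<std::uint16_t>(base + 0);
      inds[o + 4] = static_cast<std::uint16_t>(base + 2);
      inds[o + 5] = static_cast<std::uint16_t>(base + 3);
   }
   return inds;
}

} // namespace gdraw_quad_ib_example
```

// Two quads produce the index sequence 0,1,2,0,2,3,4,5,6,4,6,7.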
|
|
|
|
static void destroy_all_shaders_and_state()
|
|
{
|
|
S32 i;
|
|
|
|
for (i=0; i < GDRAW_TEXTURE__count*3; ++i) destroy_shader(&gdraw->fprog[0][i]);
|
|
for (i=0; i < GDRAW_BLENDSPECIAL__count; ++i) destroy_shader(&gdraw->exceptional_blend[i]);
|
|
for (i=0; i < 32; ++i) destroy_shader(&gdraw->filter_prog[0][i]);
|
|
for (i=0; i < MAX_TAPS+1; ++i) destroy_shader(&gdraw->blur_prog[i]);
|
|
destroy_shader(&gdraw->colormatrix);
|
|
destroy_shader(&gdraw->clear_ps);
|
|
|
|
for (i=0; i < GDRAW_vformat__basic_count; i++) {
|
|
safe_release(gdraw->inlayout[i]);
|
|
destroy_shader(&gdraw->vert[i]);
|
|
}
|
|
|
|
for (i=0; i < 2; ++i) safe_release(gdraw->raster_state[i]);
|
|
for (i=0; i < GDRAW_WRAP__count*2; ++i) safe_release(gdraw->sampler_state[0][i]);
|
|
for (i=0; i < GDRAW_BLEND__count; ++i) safe_release(gdraw->blend_state[i]);
|
|
for (i=0; i < 2*2; ++i) safe_release(gdraw->depth_state[0][i]);
|
|
|
|
safe_release(gdraw->blend_no_color_write);
|
|
|
|
safe_release(gdraw->cb_vertex);
|
|
safe_release(gdraw->cb_ps_common);
|
|
safe_release(gdraw->cb_filter);
|
|
safe_release(gdraw->cb_colormatrix);
|
|
safe_release(gdraw->cb_blur);
|
|
|
|
safe_release(gdraw->quad_ib);
|
|
}
|
|
|
|
////////////////////////////////////////////////////////////////////////
|
|
//
|
|
// Create and tear-down the state
|
|
//
|
|
|
|
typedef struct
|
|
{
|
|
S32 num_handles;
|
|
S32 num_bytes;
|
|
} GDrawResourceLimit;
|
|
|
|
// These are the default limits used by GDraw unless the user specifies something else.
|
|
static GDrawResourceLimit gdraw_limits[GDRAW_D3D1X_(RESOURCE__count)] = {
|
|
MAX_RENDER_STACK_DEPTH + 1, 16*1024*1024, // RESOURCE_rendertarget
|
|
500, 16*1024*1024, // RESOURCE_texture
|
|
1000, 2*1024*1024, // RESOURCE_vertexbuffer
|
|
0, 256*1024, // RESOURCE_dynbuffer
|
|
};
|
|
|
|
static GDrawHandleCache *make_handle_cache(gdraw_resourcetype type)
|
|
{
|
|
S32 num_handles = gdraw_limits[type].num_handles;
|
|
S32 num_bytes = gdraw_limits[type].num_bytes;
|
|
GDrawHandleCache *cache = (GDrawHandleCache *) IggyGDrawMalloc(sizeof(GDrawHandleCache) + (num_handles - 1) * sizeof(GDrawHandle));
|
|
if (cache) {
|
|
gdraw_HandleCacheInit(cache, num_handles, num_bytes);
|
|
cache->is_vertex = (type == GDRAW_D3D1X_(RESOURCE_vertexbuffer));
|
|
}
|
|
|
|
return cache;
|
|
}
|
|
|
|
static void free_gdraw()
{
   if (!gdraw) return;
   if (gdraw->texturecache) IggyGDrawFree(gdraw->texturecache);
   if (gdraw->vbufcache) IggyGDrawFree(gdraw->vbufcache);
   IggyGDrawFree(gdraw);
   gdraw = NULL;
}
|
|
|
|
static bool alloc_dynbuffer(U32 size)
|
|
{
|
|
// specified input size is vertex buffer size. determine sensible size for the
|
|
// corresponding index buffer. iggy always uses 16-bit indices and has three
|
|
// primary types of geometry it sends:
|
|
//
|
|
// 1. filled polygons. these are triangulated simple polygons and thus have
|
|
// roughly as many triangles as they have vertices. they use either 8- or
|
|
// 16-byte vertex formats; this makes a worst case of 6 bytes of indices
|
|
// for every 8 bytes of vertex data.
|
|
// 2. strokes and edge antialiasing. they use a 16-byte vertex format and
|
|
// worst-case write a "double quadstrip" which has 4 triangles for every
|
|
// 3 vertices, which means 24 bytes of index data for every 48 bytes
|
|
// of vertex data.
|
|
// 3. textured quads. they use a 16-byte vertex format, have exactly 2
|
|
// triangles for every 4 vertices, and use either a static index buffer
|
|
// (quad_ib) or a single triangle strip, so for our purposes they need no
|
|
// space to store indices at all.
|
|
//
|
|
// 1) argues for allocating index buffers at 3/4 the size of the corresponding
|
|
// vertex buffer, while 2) and 3) need 1/2 the size of the vertex buffer or less.
|
|
// 2) and 3) are the most common types of vertex data, while 1) is used only for
|
|
// morphed shapes and in certain cases when the RESOURCE_vertexbuffer pool is full.
|
|
// we just play it safe anyway and make sure we size the IB large enough to cover
|
|
// the worst case for 1). this is conservative, but it probably doesn't matter much.
|
|
U32 ibsize = (size * 3) / 4;
|
|
|
|
init_dyn_buffer(&gdraw->dyn_vb, size, D3D1X_(BIND_VERTEX_BUFFER));
|
|
init_dyn_buffer(&gdraw->dyn_ib, ibsize, D3D1X_(BIND_INDEX_BUFFER));
|
|
|
|
gdraw->max_quad_vert_count = RR_MIN(size / sizeof(gswf_vertex_xyst), QUAD_IB_COUNT * 4);
|
|
gdraw->max_quad_vert_count &= ~3; // must be multiple of four
|
|
|
|
return gdraw->dyn_vb.buffer != NULL && gdraw->dyn_ib.buffer != NULL;
|
|
}
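
// Illustrative sketch, not part of GDraw: a worked version of the sizing
// argument in alloc_dynbuffer(). It checks that an index buffer at 3/4 of the
// vertex buffer size covers the filled-polygon worst case (one triangle per
// 8-byte vertex, three 16-bit indices per triangle), and restates the quad
// vertex cap. All names below are hypothetical. The vertex size and quad
// index buffer capacity are passed in as parameters because their actual
// values are defined elsewhere in the renderer.

```cpp
#include <cassert>
#include <cstdint>

namespace gdraw_sizing_example { // hypothetical, for illustration only

// Index bytes needed if the whole vertex buffer holds 8-byte vertices of
// filled polygons: roughly one triangle (3 indices * 2 bytes) per vertex.
inline std::uint32_t filled_poly_index_bytes(std::uint32_t vb_bytes)
{
   return (vb_bytes / 8) * 3 * 2;
}

// The rule used by alloc_dynbuffer(): index buffer is 3/4 of the vertex buffer.
inline std::uint32_t index_buffer_bytes(std::uint32_t vb_bytes)
{
   return (vb_bytes * 3) / 4;
}

// Largest quad vertex count per draw: limited by the vertex buffer size and
// by the number of quads the static quad index buffer covers, then rounded
// down to a multiple of four.
inline std::uint32_t max_quad_verts(std::uint32_t vb_bytes,
                                    std::uint32_t vertex_size,
                                    std::uint32_t quad_ib_count)
{
   assert(vertex_size > 0);
   const std::uint32_t by_size = vb_bytes / vertex_size;
   const std::uint32_t by_ib   = quad_ib_count * 4;
   const std::uint32_t n       = by_size < by_ib ? by_size : by_ib;
   return n & ~3u;
}

} // namespace gdraw_sizing_example
```

// With the default 256 KiB dynamic vertex buffer, both
// filled_poly_index_bytes() and index_buffer_bytes() give 196608 bytes, so
// the 3/4 rule exactly covers the worst case described in the comment.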
|
|
|
|
int gdraw_D3D1X_(SetResourceLimits)(gdraw_resourcetype type, S32 num_handles, S32 num_bytes)
|
|
{
|
|
GDrawStats stats={0};
|
|
|
|
if (type == GDRAW_D3D1X_(RESOURCE_rendertarget)) // RT count is small and space is preallocated
|
|
num_handles = MAX_RENDER_STACK_DEPTH + 1;
|
|
|
|
assert(type >= GDRAW_D3D1X_(RESOURCE_rendertarget) && type < GDRAW_D3D1X_(RESOURCE__count));
|
|
assert(num_handles >= 0);
|
|
assert(num_bytes >= 0);
|
|
|
|
// nothing to do if the values are unchanged
|
|
if (gdraw_limits[type].num_handles == num_handles &&
|
|
gdraw_limits[type].num_bytes == num_bytes)
|
|
return 1;
|
|
|
|
gdraw_limits[type].num_handles = num_handles;
|
|
gdraw_limits[type].num_bytes = num_bytes;
|
|
|
|
// if no gdraw context created, there's nothing to worry about
|
|
if (!gdraw)
|
|
return 1;
|
|
|
|
// resize the appropriate pool
|
|
switch (type) {
|
|
case GDRAW_D3D1X_(RESOURCE_rendertarget):
|
|
flush_rendertargets(&stats);
|
|
gdraw_HandleCacheInit(&gdraw->rendertargets, num_handles, num_bytes);
|
|
return 1;
|
|
|
|
case GDRAW_D3D1X_(RESOURCE_texture):
|
|
if (gdraw->texturecache) {
|
|
gdraw_res_flush(gdraw->texturecache, &stats);
|
|
IggyGDrawFree(gdraw->texturecache);
|
|
}
|
|
gdraw->texturecache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_texture));
|
|
return gdraw->texturecache != NULL;
|
|
|
|
case GDRAW_D3D1X_(RESOURCE_vertexbuffer):
|
|
if (gdraw->vbufcache) {
|
|
gdraw_res_flush(gdraw->vbufcache, &stats);
|
|
IggyGDrawFree(gdraw->vbufcache);
|
|
}
|
|
gdraw->vbufcache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_vertexbuffer));
|
|
return gdraw->vbufcache != NULL;
|
|
|
|
case GDRAW_D3D1X_(RESOURCE_dynbuffer):
|
|
unbind_resources();
|
|
safe_release(gdraw->dyn_vb.buffer);
|
|
safe_release(gdraw->dyn_ib.buffer);
|
|
return alloc_dynbuffer(num_bytes);
|
|
|
|
default:
|
|
return 0;
|
|
}
|
|
}
|
|
|
|
static GDrawFunctions *create_context(ID3D1XDevice *dev, ID3D1XContext *ctx, S32 w, S32 h)
|
|
{
|
|
gdraw = (GDraw *) IggyGDrawMalloc(sizeof(*gdraw));
|
|
if (!gdraw) return NULL;
|
|
|
|
memset(gdraw, 0, sizeof(*gdraw));
|
|
|
|
gdraw->frametex_width = w;
|
|
gdraw->frametex_height = h;
|
|
gdraw->d3d_device = dev;
|
|
gdraw->d3d_context = ctx;
|
|
|
|
gdraw->texturecache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_texture));
|
|
gdraw->vbufcache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_vertexbuffer));
|
|
gdraw_HandleCacheInit(&gdraw->rendertargets, gdraw_limits[GDRAW_D3D1X_(RESOURCE_rendertarget)].num_handles, gdraw_limits[GDRAW_D3D1X_(RESOURCE_rendertarget)].num_bytes);
|
|
|
|
if (!gdraw->texturecache || !gdraw->vbufcache || !alloc_dynbuffer(gdraw_limits[GDRAW_D3D1X_(RESOURCE_dynbuffer)].num_bytes)) {
|
|
free_gdraw();
|
|
return NULL;
|
|
}
|
|
|
|
create_all_shaders_and_state();
|
|
|
|
gdraw_funcs.SetViewSizeAndWorldScale = gdraw_SetViewSizeAndWorldScale;
|
|
gdraw_funcs.GetInfo = gdraw_GetInfo;
|
|
|
|
gdraw_funcs.DescribeTexture = gdraw_DescribeTexture;
|
|
gdraw_funcs.DescribeVertexBuffer = gdraw_DescribeVertexBuffer;
|
|
|
|
gdraw_funcs.RenderingBegin = gdraw_RenderingBegin;
|
|
gdraw_funcs.RenderingEnd = gdraw_RenderingEnd;
|
|
gdraw_funcs.RenderTileBegin = gdraw_RenderTileBegin;
|
|
gdraw_funcs.RenderTileEnd = gdraw_RenderTileEnd;
|
|
|
|
gdraw_funcs.TextureDrawBufferBegin = gdraw_TextureDrawBufferBegin;
|
|
gdraw_funcs.TextureDrawBufferEnd = gdraw_TextureDrawBufferEnd;
|
|
|
|
gdraw_funcs.DrawIndexedTriangles = gdraw_DrawIndexedTriangles;
|
|
gdraw_funcs.FilterQuad = gdraw_FilterQuad;
|
|
|
|
gdraw_funcs.SetAntialiasTexture = gdraw_SetAntialiasTexture;
|
|
|
|
gdraw_funcs.ClearStencilBits = gdraw_ClearStencilBits;
|
|
gdraw_funcs.ClearID = gdraw_ClearID;
|
|
|
|
gdraw_funcs.MakeTextureBegin = gdraw_MakeTextureBegin;
|
|
gdraw_funcs.MakeTextureMore = gdraw_MakeTextureMore;
|
|
gdraw_funcs.MakeTextureEnd = gdraw_MakeTextureEnd;
|
|
|
|
gdraw_funcs.UpdateTextureBegin = gdraw_UpdateTextureBegin;
|
|
gdraw_funcs.UpdateTextureRect = gdraw_UpdateTextureRect;
|
|
gdraw_funcs.UpdateTextureEnd = gdraw_UpdateTextureEnd;
|
|
|
|
gdraw_funcs.FreeTexture = gdraw_FreeTexture;
|
|
gdraw_funcs.TryToLockTexture = gdraw_TryToLockTexture;
|
|
|
|
gdraw_funcs.MakeTextureFromResource = (gdraw_make_texture_from_resource *) gdraw_D3D1X_(MakeTextureFromResource);
|
|
gdraw_funcs.FreeTextureFromResource = gdraw_D3D1X_(DestroyTextureFromResource);
|
|
|
|
gdraw_funcs.MakeVertexBufferBegin = gdraw_MakeVertexBufferBegin;
|
|
gdraw_funcs.MakeVertexBufferMore = gdraw_MakeVertexBufferMore;
|
|
gdraw_funcs.MakeVertexBufferEnd = gdraw_MakeVertexBufferEnd;
|
|
gdraw_funcs.TryToLockVertexBuffer = gdraw_TryLockVertexBuffer;
|
|
gdraw_funcs.FreeVertexBuffer = gdraw_FreeVertexBuffer;
|
|
|
|
gdraw_funcs.UnlockHandles = gdraw_UnlockHandles;
|
|
gdraw_funcs.SetTextureUniqueID = gdraw_SetTextureUniqueID;
|
|
|
|
gdraw_funcs.Set3DTransform = gdraw_Set3DTransform;
|
|
|
|
return &gdraw_funcs;
|
|
}
|
|
|
|
void gdraw_D3D1X_(DestroyContext)(void)
|
|
{
|
|
if (gdraw && gdraw->d3d_device) {
|
|
GDrawStats stats={0};
|
|
clear_renderstate();
|
|
stencil_state_cache_clear();
|
|
destroy_all_shaders_and_state();
|
|
safe_release(gdraw->aa_tex);
|
|
safe_release(gdraw->aa_tex_view);
|
|
safe_release(gdraw->dyn_vb.buffer);
|
|
safe_release(gdraw->dyn_ib.buffer);
|
|
|
|
flush_rendertargets(&stats);
|
|
if (gdraw->texturecache) gdraw_res_flush(gdraw->texturecache, &stats);
|
|
if (gdraw->vbufcache) gdraw_res_flush(gdraw->vbufcache, &stats);
|
|
|
|
gdraw->d3d_device = NULL;
|
|
}
|
|
|
|
free_gdraw();
|
|
}
|
|
|
|
void gdraw_D3D1X_(SetErrorHandler)(void (__cdecl *error_handler)(HRESULT hr))
|
|
{
|
|
if (gdraw)
|
|
gdraw->error_handler = error_handler;
|
|
}
|
|
|
|
void gdraw_D3D1X_(PreReset)(void)
|
|
{
|
|
if (!gdraw) return;
|
|
|
|
GDrawStats stats={0};
|
|
flush_rendertargets(&stats);
|
|
|
|
// we may end up resizing the frame buffer
|
|
gdraw->frametex_width = 0;
|
|
gdraw->frametex_height = 0;
|
|
}
|
|
|
|
void gdraw_D3D1X_(PostReset)(void)
|
|
{
|
|
// maybe re-create rendertargets right now?
|
|
}
|
|
|
|
void RADLINK gdraw_D3D1X_(BeginCustomDraw)(IggyCustomDrawCallbackRegion * region, F32 mat[4][4])
|
|
{
|
|
clear_renderstate();
|
|
gdraw_GetObjectSpaceMatrix(mat[0], region->o2w, gdraw->projection, 0, 0);
|
|
}
|
|
|
|
void RADLINK gdraw_D3D1X_(BeginCustomDraw_4J)(IggyCustomDrawCallbackRegion * region, F32 mat[16])
|
|
{
|
|
clear_renderstate();
|
|
gdraw_GetObjectSpaceMatrix(mat, region->o2w, gdraw->projection, 0, 0);
|
|
}
|
|
|
|
void RADLINK gdraw_D3D1X_(CalculateCustomDraw_4J)(IggyCustomDrawCallbackRegion * region, F32 mat[16])
|
|
{
|
|
gdraw_GetObjectSpaceMatrix(mat, region->o2w, gdraw->projection, 0, 0);
|
|
}
|
|
|
|
void RADLINK gdraw_D3D1X_(EndCustomDraw)(IggyCustomDrawCallbackRegion * /*region*/)
|
|
{
|
|
GDrawStats stats={};
|
|
set_common_renderstate();
|
|
set_viewport();
|
|
set_render_target(&stats);
|
|
}
|
|
|
|
void RADLINK gdraw_D3D1X_(GetResourceUsageStats)(gdraw_resourcetype type, S32 *handles_used, S32 *bytes_used)
|
|
{
|
|
GDrawHandleCache *cache;
|
|
|
|
switch (type) {
|
|
case GDRAW_D3D1X_(RESOURCE_rendertarget): cache = &gdraw->rendertargets; break;
|
|
case GDRAW_D3D1X_(RESOURCE_texture): cache = gdraw->texturecache; break;
|
|
case GDRAW_D3D1X_(RESOURCE_vertexbuffer): cache = gdraw->vbufcache; break;
|
|
case GDRAW_D3D1X_(RESOURCE_dynbuffer): *handles_used = 0; *bytes_used = gdraw->last_dyn_maxalloc; return;
|
|
default: cache = NULL; break;
|
|
}
|
|
|
|
*handles_used = *bytes_used = 0;
|
|
|
|
if (cache) {
|
|
S32 i;
|
|
U64 frame = gdraw->frame_counter;
|
|
|
|
for (i=0; i < cache->max_handles; ++i)
|
|
if (cache->handle[i].bytes && cache->handle[i].owner && cache->handle[i].fence.value == frame) {
|
|
*handles_used += 1;
|
|
*bytes_used += cache->handle[i].bytes;
|
|
}
|
|
}
|
|
}
|
|
|
|
static S32 num_pixels(S32 w, S32 h, S32 mipmaps)
|
|
{
|
|
S32 k, pixels=0;
|
|
for (k=0; k < mipmaps; ++k) {
|
|
pixels += w*h*2;
|
|
w = (w>>1); w += !w;
|
|
h = (h>>1); h += !h;
|
|
}
|
|
return pixels;
|
|
}
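
// Illustrative sketch, not part of GDraw: num_pixels() walks a mip chain by
// halving each dimension and clamping it to at least 1 (the "w += !w" step).
// The helper below lists the per-level dimensions produced by that rule. The
// namespace and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

namespace gdraw_mip_example { // hypothetical, for illustration only

// Halve a dimension, never letting it drop below 1.
inline std::int32_t next_mip_dim(std::int32_t d)
{
   d >>= 1;
   return d + (d == 0 ? 1 : 0);
}

// Return (width, height) for each of the first `mipmaps` levels.
inline std::vector<std::pair<std::int32_t, std::int32_t> >
mip_chain(std::int32_t w, std::int32_t h, std::int32_t mipmaps)
{
   assert(w > 0 && h > 0 && mipmaps >= 0);
   std::vector<std::pair<std::int32_t, std::int32_t> > levels;
   for (std::int32_t k = 0; k < mipmaps; ++k) {
      levels.push_back(std::make_pair(w, h));
      w = next_mip_dim(w);
      h = next_mip_dim(h);
   }
   return levels;
}

} // namespace gdraw_mip_example
```

// An 8x2 texture with four mip levels yields 8x2, 4x1, 2x1, and 1x1.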
|
|
|
|
GDrawTexture * RADLINK gdraw_D3D1X_(MakeTextureFromResource)(U8 *resource_file, S32 /*len*/, IggyFileTextureRaw *texture)
{
   const char *failed_call="";
   U8 *free_data = 0;
   GDrawTexture *t=0;
   S32 width, height, mipmaps, size, blk;
   ID3D1X(Texture2D) *tex=0;
   ID3D1X(ShaderResourceView) *view=0;

   DXGI_FORMAT d3dfmt;
   D3D1X_(SUBRESOURCE_DATA) mipdata[24] = { 0 };
   S32 k;

   HRESULT hr = S_OK;

   width = texture->w;
   height = texture->h;
   mipmaps = texture->mipmaps;
   blk = 1;

   D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(width), static_cast<U32>(height), static_cast<U32>(mipmaps), 1U, DXGI_FORMAT_UNKNOWN, { 1, 0 },
                                   D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_SHADER_RESOURCE), 0U, 0U };

   bool done = false;

   // size is bytes per texel for uncompressed data, bytes per 4x4 block for DXT
   switch (texture->format) {
      case IFT_FORMAT_rgba_8888 : size= 4; d3dfmt = DXGI_FORMAT_R8G8B8A8_UNORM; break;
      case IFT_FORMAT_DXT1      : size= 8; d3dfmt = DXGI_FORMAT_BC1_UNORM; blk = 4; break;
      case IFT_FORMAT_DXT3      : size=16; d3dfmt = DXGI_FORMAT_BC2_UNORM; blk = 4; break;
      case IFT_FORMAT_DXT5      : size=16; d3dfmt = DXGI_FORMAT_BC3_UNORM; blk = 4; break;
      default: {
         IggyGDrawSendWarning(NULL, "GDraw .iggytex raw texture format %d not supported by hardware", texture->format);
         done = true;
      }
   }

   if (!done) {
      desc.Format = d3dfmt;

      U8 *data = resource_file + texture->file_offset;

      // NOTE: i_8 and i_4 have no case in the switch above, so they take the
      // default path and set done; as written, this branch is never reached.
      if (texture->format == IFT_FORMAT_i_8 || texture->format == IFT_FORMAT_i_4) {
         // convert from intensity to luma+alpha
         S32 i;
         S32 total_size = 2 * num_pixels(width,height,mipmaps);

         free_data = (U8 *) IggyGDrawMalloc(total_size);
         if (!free_data) {
            IggyGDrawSendWarning(NULL, "GDraw out of memory to store texture data to pass to D3D for %d x %d texture", width, height);
            done = true;
         } else {
            U8 *cur = free_data;

            for (k=0; k < mipmaps; ++k) {
               S32 w = RR_MAX(width >> k, 1);
               S32 h = RR_MAX(height >> k, 1);
               for (i=0; i < w*h; ++i) {
                  cur[0] = cur[1] = *data++;
                  cur += 2;
               }
            }
            data = free_data;
         }
      }

      if (!done) {
         // describe each mip level; pitch and size are measured in blocks
         for (k=0; k < mipmaps; ++k) {
            S32 w = RR_MAX(width >> k, 1);
            S32 h = RR_MAX(height >> k, 1);
            S32 blkw = (w + blk-1) / blk;
            S32 blkh = (h + blk-1) / blk;

            mipdata[k].pSysMem = data;
            mipdata[k].SysMemPitch = blkw * size;
            data += blkw * blkh * size;
         }

         failed_call = "CreateTexture2D";
         hr = gdraw->d3d_device->CreateTexture2D(&desc, mipdata, &tex);
         if (!FAILED(hr)) {
            failed_call = "CreateShaderResourceView for texture creation";
            hr = gdraw->d3d_device->CreateShaderResourceView(tex, NULL, &view);
            if (!FAILED(hr))
               t = gdraw_D3D1X_(WrappedTextureCreate)(view);
         }
      }
   }

   if (FAILED(hr)) {
      report_d3d_error(hr, failed_call, "");
   }

   if (free_data)
      IggyGDrawFree(free_data);

   if (!t) {
      if (view)
         view->Release();
      if (tex)
         tex->Release();
   } else {
      // keep the texture pointer so DestroyTextureFromResource can release it
      ((GDrawHandle *) t)->handle.tex.d3d = tex;
   }
   return t;
}

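The mip loop in MakeTextureFromResource sizes each level in blocks: `blk` is the block edge in texels (1 for rgba_8888, 4 for DXT) and `size` is the bytes per block. A minimal sketch of that arithmetic, with the hypothetical names `MipLayout` and `mip_layout` standing in for the inline code:

```cpp
#include <cstdint>

// Hypothetical result type: row pitch and total size of one mip level, in bytes.
struct MipLayout { int32_t pitch; int32_t bytes; };

// blk:  block edge in texels (1 = uncompressed, 4 = DXT/BC)
// size: bytes per block (bytes per texel when blk == 1)
constexpr MipLayout mip_layout(int32_t width, int32_t height, int32_t level,
                               int32_t blk, int32_t size)
{
   int32_t w = (width  >> level) > 1 ? (width  >> level) : 1;
   int32_t h = (height >> level) > 1 ? (height >> level) : 1;
   int32_t blkw = (w + blk - 1) / blk;   // round up to whole blocks
   int32_t blkh = (h + blk - 1) / blk;
   return MipLayout{ blkw * size, blkw * blkh * size };
}

// 16x16 DXT1 (8 bytes per 4x4 block), level 0: 4 blocks per row, 4 rows.
static_assert(mip_layout(16, 16, 0, 4, 8).pitch == 32,  "DXT1 pitch");
static_assert(mip_layout(16, 16, 0, 4, 8).bytes == 128, "DXT1 level size");
// Level 3 is 2x2 texels, which still occupies one full block.
static_assert(mip_layout(16, 16, 3, 4, 8).bytes == 8,   "small mip rounds up");
// 5x3 rgba_8888 (4 bytes per texel): pitch 20, size 60.
static_assert(mip_layout(5, 3, 0, 1, 4).bytes == 60,    "uncompressed size");
```

Rounding up to whole blocks is what keeps mip levels smaller than 4x4 from being given a zero-byte slice of the source data.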
void RADLINK gdraw_D3D1X_(DestroyTextureFromResource)(GDrawTexture *tex)
{
   GDrawHandle *h = (GDrawHandle *) tex;
   safe_release(h->handle.tex.d3d_view);
   safe_release(h->handle.tex.d3d);
   gdraw_D3D1X_(WrappedTextureDestroy)(tex);
}
