504 Gateway Time-out<\/title><\/head> \n<body> \n<center><\/p>\n<h1><\/span>504 Gateway Time-out<\/span><\/h1>\n<\/center> \n<\/body> \n<\/html><\/p>\n$ aws bedrock-runtime invoke-model \\ \n --model-id anthropic.claude-v2:1 \\ \n --body '{"prompt": "\\n\\nHuman: Why is the stack failing?\\n\\nAssistant:", "max_tokens_to_sample": 300}' \\ \n --region us-east-1 \\ \n output.txt<\/p>\nAn error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before retrying.<\/p>\n$ tail -f \/var\/log\/cloudwatch\/bedrock-integration-errors.log \n[2024-05-20T03:14:22Z] ERROR: Lambda runtime timed out after 29.002s. \n[2024-05-20T03:14:23Z] ERROR: botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https:\/\/bedrock-runtime.us-east-1.amazonaws.com\/model\/anthropic.claude-v2:1\/invoke" \n[2024-05-20T03:14:25Z] FATAL: Upstream "aws ai" service unreachable via VPC Endpoint.<\/p>\n<pre class=\"codehilite\"><code>## The 3 AM Reality Check\n\nI\u2019ve been awake for 72 hours. My eyes feel like they\u2019ve been scrubbed with industrial-grade sandpaper, and my bloodstream is approximately 40% Monster Energy and 60% pure, unadulterated spite. If I see one more slide deck about "intelligent automation" or "self-healing infrastructure," I am going to throw my YubiKey into the nearest cooling fan.\n\nThe "visionaries" in the C-suite decided six months ago that our legacy heuristic-based monitoring wasn't "forward-looking" enough. They wanted "aws ai" integration. They wanted a black box that could predict outages before they happened. Well, congratulations, Greg. The black box didn't predict the outage; the black box *was* the outage. \n\nWe replaced a perfectly functional, if slightly noisy, Prometheus\/Grafana stack with a convoluted mess of Lambda functions, Bedrock calls, and "AI-driven" auto-scaling groups. When the traffic spiked on Friday night\u2014a standard end-of-quarter batch processing load\u2014the "aws ai" logic decided that the latency increase wasn't a resource bottleneck, but a "pattern shift." 
It started spinning up instances like a caffeinated squirrel, which triggered a cascading failure in our IAM evaluation logic and hit the service quotas for Bedrock faster than you can say "over-engineered."\n\nI\u2019m writing this because the post-mortem is due in four hours, and if I don't vent this into an IRC channel of people who actually know what a subnet mask is, I\u2019m going to quit and go farm goats in the mountains.\n\n## H2: The IAM Policy from Hell\n\nLet\u2019s talk about the "aws ai" permission model. You\u2019d think that granting a Lambda function access to invoke a model would be a simple `Allow` on `bedrock:InvokeModel`. But no. Because we\u2019re using Provisioned Throughput (which we had to buy because the On-Demand limits are a joke), the IAM requirements mutated into a multi-headed hydra.\n\nWe spent four hours just trying to figure out why the production role, which worked in the `dev` account, was throwing 403s in `prod`. It turns out that if you\u2019re using a VPC endpoint for "aws ai" services, the endpoint policy *also* needs to explicitly allow the action, even if the identity-based policy is wide open. \n\nHere is the JSON block that cost me six hours of my life because the documentation for `boto3 v1.34.82` didn't mention the specific resource ARN format for provisioned models:\n\n```json\n{\n "Version": "2012-10-17",\n "Statement": [\n {\n "Sid": "BedrockScopedAccess",\n "Effect": "Allow",\n "Action": [\n "bedrock:InvokeModel",\n "bedrock:InvokeModelWithResponseStream"\n ],\n "Resource": [\n "arn:aws:bedrock:us-east-1:123456789012:provisioned-model\/5x9p2q7r4s1t",\n "arn:aws:bedrock:us-east-1::foundation-model\/anthropic.claude-v2:1"\n ]\n },\n {\n "Sid": "VPCPEndpointPolicy",\n "Effect": "Allow",\n "Principal": "*",\n "Action": "bedrock:InvokeModel",\n "Resource": "*",\n "Condition": {\n "StringEquals": {\n "aws:SourceVpce": "vpce-0a1b2c3d4e5f6g7h8"\n }\n }\n }\n ]\n}\n<\/code><\/pre>\nThe kicker? 
The <code>Resource<\/code> ARN for the provisioned model doesn’t follow the same pattern as the foundation model. If you miss one character, the “aws ai” SDK just returns a generic <code>AccessDeniedException<\/code> with zero hint about whether it\u2019s the IAM role, the KMS key (oh yeah, we had to encrypt the inputs), or the VPC endpoint policy. We were flying blind in a storm of our own making.<\/p>\n<h2><\/span>H2: Latency is Not a Suggestion<\/span><\/h2>\nThe “aws ai” advocates love to talk about “near-instantaneous insights.” In reality, calling <code>anthropic.claude-v2:1<\/code> via a Lambda function running <code>Python 3.12.1<\/code> is about as fast as a snail crawling through molasses. <\/p>\nWe were seeing cold starts on the Lambda side of about 800ms, which is fine, whatever. But the actual <code>invoke_model<\/code> call? Even with Provisioned Throughput, we were hitting 2.5 to 5 seconds for simple inference. Our API Gateway has a hard 29-second timeout. When the “aws ai” logic started getting bogged down by large context windows (because the “intelligent” agent decided it needed to read the last 500 lines of syslog for every request), the entire request chain backed up.<\/p>\nThe “aws ai” integration essentially turned our high-throughput event bus into a sequential queue. One slow inference call held up the worker, which held up the SQS consumer, which eventually caused the SQS queue to hit the 14-day retention limit because we couldn’t process messages fast enough. We were paying for “intelligence” and getting a lobotomized turtle in return.<\/p>\n<h2><\/span>H2: The Cost-Per-Token Heart Attack<\/span><\/h2>\nWhile I was trying to fix the 504 errors, I took a look at the Billing Dashboard. I nearly vomited. <\/p>\nThe “aws ai” services charge by the token. Our “visionary” implementation didn’t have any token-limiting logic in the prompt templates. The system was feeding entire JSON blobs into the model. 
Every time a developer pushed a debug log, the “aws ai” would ingest it, process it, and spit out a “summary” that cost us $0.05. Multiply that by 10,000 events per minute during the surge.<\/p>\nI found a CloudWatch log snippet that showed exactly how we were burning money:<\/p>\n<pre class=\"codehilite\"><code class=\"language-json\">{\n "timestamp": "2024-05-20T04:20:00Z",\n "level": "INFO",\n "message": "Model invocation successful",\n "model_id": "anthropic.claude-v2:1",\n "usage": {\n "input_tokens": 4502,\n "output_tokens": 128,\n "total_tokens": 4630\n },\n "billing_estimate_usd": 0.078,\n "request_id": "req-99-problems-and-ai-is-all-of-them"\n}\n<\/code><\/pre>\nEight cents. For one log line. We were literally burning the company\u2019s runway to have an LLM tell us that “The system is experiencing high load,” which I already knew because my pager was vibrating off the nightstand. The “aws ai” cost-per-token model is a predatory tax on companies that don’t have the sense to use a <code>grep<\/code> command.<\/p>\n<h2><\/span>H2: VPC Endpoints and the PrivateLink Tax<\/span><\/h2>\nBecause we\u2019re in a “highly regulated industry,” we can\u2019t just let our traffic traverse the public internet. We have to use PrivateLink. Setting up the VPC endpoint for “aws ai” (Bedrock) was a nightmare of DNS resolution issues.<\/p>\nWe\u2019re using <code>AWS CLI v2.15.30<\/code>. When you run a command inside the VPC, it should<\/em> resolve to the private IP of the endpoint. But because of a misconfiguration in our DHCP options set\u2014which had been there for years but never mattered until now\u2014the “aws ai” SDK kept trying to hit the public endpoint. <\/p>\nSince we had no NAT Gateway in that specific private subnet (to save costs, ironically), the requests just hung until they timed out. 
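In hindsight, a thirty-second check would have told us whether the Bedrock hostname was resolving to the endpoint's private IPs or leaking out to the public ones. What follows is a minimal sketch, not what we actually ran: the function names are my own (hypothetical, not part of boto3 or the AWS CLI), and it assumes that a working private DNS setup returns only private addresses for the regional hostname.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """True if Python's ipaddress module classifies this address as private."""
    return ipaddress.ip_address(ip).is_private


def resolve_endpoint(hostname: str, port: int = 443) -> list[str]:
    """Resolve hostname and return its unique IP addresses, sorted."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def check_private_resolution(hostname: str) -> bool:
    """True only if the hostname resolves and every address is private."""
    addresses = resolve_endpoint(hostname)
    return bool(addresses) and all(is_private_address(a) for a in addresses)
```

Run from inside the private subnet, `check_private_resolution("bedrock-runtime.us-east-1.amazonaws.com")` should come back `True` when private DNS is doing its job. A `False` (or a `socket.gaierror`) points at the resolver or DHCP options rather than at IAM.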
<\/p>\n<pre class=\"codehilite\"><code class=\"language-bash\"># The command that failed for 3 hours\n$ aws bedrock-runtime invoke-model --endpoint-url https:\/\/vpce-0a1b2c3d4e5f6g7h8-xyz.bedrock-runtime.us-east-1.vpce.amazonaws.com ...\n<\/code><\/pre>\nEven after we fixed the DNS, we realized that the VPC endpoint for “aws ai” doesn’t support cross-region requests. Our failover stack in <code>us-west-2<\/code> couldn’t talk to the Bedrock models in <code>us-east-1<\/code> without a complex VPC peering setup that our network team (of one person, who is on vacation) hadn’t approved. So, the “self-healing” infrastructure was actually a “self-destructing” infrastructure if a single region had a hiccup.<\/p>\n<h2><\/span>H2: Provisioned Throughput vs. On-Demand Chaos<\/span><\/h2>\nThe “aws ai” marketing says you can start small with On-Demand and scale up. That is a lie. On-Demand limits for Bedrock are so low they\u2019re practically decorative. We hit the <code>ThrottlingException<\/code> within the first ten minutes of the traffic spike.<\/p>\nSo, we switched to Provisioned Throughput. Do you know how much that costs? You have to commit to a “Model Unit” for either 1 month or 6 months. It\u2019s like buying a mainframe in the 70s just to run a calculator. And you can’t just “scale it up” instantly. Provisioning a new unit takes time. <\/p>\nDuring the 72-hour hell-march, I had to explain to the CFO why we were committing to a $20,000-a-month spend just to get the “aws ai” to stop throwing 429 errors. <\/p>\nThe “aws ai” scaling isn’t elastic; it’s brittle. It\u2019s a glass skyscraper in an earthquake zone. When the load hit, the On-Demand side choked, and the Provisioned side wasn’t large enough to handle the overflow. 
We were stuck in a “dead zone” where we couldn’t process the backlog, and we couldn’t scale the “intelligence” fast enough to clear it.<\/p>\n<h2><\/span>H2: Lambda Cold Starts and the Python 3.12 Runtime<\/span><\/h2>\nWe decided to use the latest and greatest: <code>Python 3.12.1<\/code>. Surely, the performance improvements would help with the “aws ai” overhead? <\/p>\nWrong. The <code>boto3<\/code> and <code>botocore<\/code> versions required to support the latest “aws ai” features are massive. By the time we bundled the dependencies into a Lambda layer, we were pushing the 250MB unzipped limit. This led to atrocious cold start times. <\/p>\nEvery time the “AI-driven” auto-scaler decided to spin up more Lambda executors, we\u2019d see a spike in 504 errors because the first few requests would time out while the container was still initializing the “aws ai” client.<\/p>\nI spent four hours stripping out unnecessary sub-modules from <code>botocore<\/code> just to get the package size down. I shouldn’t be doing tree-shaking on a Python library at 4 AM just because the “aws ai” SDK is bloated with every single AWS service definition since 2006.<\/p>\nHere\u2019s the snippet of the <code>serverless.yml<\/code> that I eventually had to hack together just to keep the cold starts under control:<\/p>\n<pre class=\"codehilite\"><code class=\"language-yaml\">functions:\n ai-analyzer:\n handler: handler.analyze\n runtime: python3.12\n memorySize: 3008 # Over-provisioning RAM just to get more CPU for faster imports\n timeout: 30\n environment:\n BOTO_CONFIG: \/var\/task\/boto_config\n layers:\n - arn:aws:lambda:us-east-1:123456789012:layer:aws-ai-optimized-sdk:1\n<\/code><\/pre>\nEven with 3GB of RAM, the “aws ai” initialization was sluggish. It\u2019s a fundamental mismatch: Lambda is meant for short-lived, fast-executing tasks. “aws ai” is a heavy, high-latency, state-heavy beast. 
Putting them together is like trying to put a jet engine on a tricycle.<\/p>\n<h2><\/span>The Agony of the “Black Box” Debugging<\/span><\/h2>\nThe worst part of this entire 72-hour ordeal wasn’t the technical hurdles. It was the lack of visibility. When a standard database query fails, I can look at the execution plan. I can see the locks. I can see the disk I\/O. <\/p>\nWhen an “aws ai” call fails or returns garbage, I have nothing. I have a <code>request_id<\/code> and a prayer. I spent two hours trying to figure out why the model was suddenly returning empty strings. Was it a prompt injection? Was it a safety filter? Was the model having a stroke? <\/p>\nThe “aws ai” logs don’t tell you why<\/em> a model refused to answer. They just give you a <code>finish_reason: \"content_filter\"<\/code>. Which content? Why? No one knows. It\u2019s “proprietary.” <\/p>\nSo there I was, the “Site Reliability Engineer,” responsible for the reliability of a site that depended on a component I couldn’t monitor, couldn’t tune, and couldn’t understand. I was just a glorified plumber trying to fix a leak in a pipe made of “magic.”<\/p>\n<h2><\/span>Lessons Learned (The Hard Way)<\/span><\/h2>\n<ol>\n<li>Stop using “aws ai” for critical path logic.<\/strong> If your system can’t boot without an LLM’s permission, your system is broken by design.<\/li>\n<li>Buy more RAM and run local models.<\/strong> A quantized Llama-3 model running on a beefy EC2 instance with an NVIDIA GPU is more predictable, faster, and cheaper than the “aws ai” token-based circus.<\/li>\n<li>Grep is your friend.<\/strong> You don’t need a multi-billion parameter model to find an <code>ERROR<\/code> string in a log file. 
Stop over-complicating things.<\/li>\n<li>VPC Endpoints are a hidden tax.<\/strong> Factor in the PrivateLink costs and the DNS headache before you commit to “secure” AI.<\/li>\n<li>Provisioned Throughput is a trap.<\/strong> It\u2019s just a way to lock you into a high monthly spend for a service that should be elastic.<\/li>\n<li>Documentation is a suggestion.<\/strong> The real documentation is in the <code>botocore<\/code> source code on GitHub. Read it, because the AWS docs won’t save you at 3 AM.<\/li>\n<\/ol>\nI\u2019m going to sleep now. If the “aws ai” decides to hallucinate another outage, tell it to fix it itself. I\u2019m out.<\/p>\n```bash \n$ history -c \n$ logout \n$ exit 1 \n```<\/p>\n","protected":false},"excerpt":{"rendered":"$ curl -v -X POST https:\/\/api.internal.production.vortex\/v1\/inference \\ -H “Content-Type: application\/json” \\ -d ‘{“prompt”: “Analyze system logs for anomaly detection”, “max_tokens”: 512}’ Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0) POST \/v1\/inference HTTP\/1.1 Host: api.internal.production.vortex User-Agent: curl\/8.5.0 Accept: \/ Content-Type: application\/json Content-Length: 72 < HTTP\/1.1 504 Gateway Timeout < Content-Type: text\/html < Content-Length: 160 < Connection: keep-alive … <a title=\"AWS AI Guide: Build and Scale Smarter Applications\" class=\"read-more\" 
href=\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\" aria-label=\"Read more on AWS AI Guide: Build and Scale Smarter Applications\">Read more<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4806","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"\n<title>AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale\" \/>\n<meta property=\"og:description\" content=\"$ curl -v -X POST https:\/\/api.internal.production.vortex\/v1\/inference -H “Content-Type: application\/json” -d ‘{“prompt”: “Analyze system logs for anomaly detection”, “max_tokens”: 512}’ Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0) POST \/v1\/inference HTTP\/1.1 Host: api.internal.production.vortex User-Agent: curl\/8.5.0 Accept: \/ Content-Type: application\/json Content-Length: 72 < HTTP\/1.1 504 Gateway Timeout < Content-Type: text\/html < Content-Length: 160 < Connection: keep-alive ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\" \/>\n<meta property=\"og:site_name\" content=\"ITSupportWale\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Itsupportwale-298547177495978\" \/>\n<meta property=\"article:published_time\" content=\"2026-06-03T18:53:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2021\/05\/android-chrome-512x512-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"512\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Techie\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Techie\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\"},\"author\":{\"name\":\"Techie\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d\"},\"headline\":\"AWS AI Guide: Build and Scale Smarter 
Applications\",\"datePublished\":\"2026-06-03T18:53:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\"},\"wordCount\":1561,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#organization\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\",\"url\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\",\"name\":\"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale\",\"isPartOf\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#website\"},\"datePublished\":\"2026-06-03T18:53:16+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/itsupportwale.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AWS AI Guide: Build and Scale Smarter Applications\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#website\",\"url\":\"https:\/\/itsupportwale.com\/blog\/\",\"name\":\"ITSupportWale\",\"description\":\"Tips, Tricks, Fixed-Errors, Tutorials & 
Guides\",\"publisher\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/itsupportwale.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#organization\",\"name\":\"itsupportwale\",\"url\":\"https:\/\/itsupportwale.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png\",\"contentUrl\":\"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png\",\"width\":1119,\"height\":144,\"caption\":\"itsupportwale\"},\"image\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/Itsupportwale-298547177495978\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d\",\"name\":\"Techie\",\"sameAs\":[\"https:\/\/itsupportwale.com\",\"iswblogadmin\"],\"url\":\"https:\/\/itsupportwale.com\/blog\/author\/iswblogadmin\/\"}]}<\/script>\n","yoast_head_json":{"title":"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","og_locale":"en_US","og_type":"article","og_title":"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale","og_description":"$ curl -v -X POST 
https:\/\/api.internal.production.vortex\/v1\/inference -H “Content-Type: application\/json” -d ‘{“prompt”: “Analyze system logs for anomaly detection”, “max_tokens”: 512}’ Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0) POST \/v1\/inference HTTP\/1.1 Host: api.internal.production.vortex User-Agent: curl\/8.5.0 Accept: \/ Content-Type: application\/json Content-Length: 72 < HTTP\/1.1 504 Gateway Timeout < Content-Type: text\/html < Content-Length: 160 < Connection: keep-alive ... Read more","og_url":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","og_site_name":"ITSupportWale","article_publisher":"https:\/\/www.facebook.com\/Itsupportwale-298547177495978","article_published_time":"2026-06-03T18:53:16+00:00","og_image":[{"width":512,"height":512,"url":"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2021\/05\/android-chrome-512x512-1.png","type":"image\/png"}],"author":"Techie","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Techie","Est. 
reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#article","isPartOf":{"@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"},"author":{"name":"Techie","@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d"},"headline":"AWS AI Guide: Build and Scale Smarter Applications","datePublished":"2026-06-03T18:53:16+00:00","mainEntityOfPage":{"@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"},"wordCount":1561,"commentCount":0,"publisher":{"@id":"https:\/\/itsupportwale.com\/blog\/#organization"},"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","url":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","name":"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale","isPartOf":{"@id":"https:\/\/itsupportwale.com\/blog\/#website"},"datePublished":"2026-06-03T18:53:16+00:00","breadcrumb":{"@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/itsupportwale.com\/blog\/"},{"@type":"ListItem","position":2,"name":"AWS AI Guide: Build and Scale Smarter 
Applications"}]},{"@type":"WebSite","@id":"https:\/\/itsupportwale.com\/blog\/#website","url":"https:\/\/itsupportwale.com\/blog\/","name":"ITSupportWale","description":"Tips, Tricks, Fixed-Errors, Tutorials & Guides","publisher":{"@id":"https:\/\/itsupportwale.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/itsupportwale.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/itsupportwale.com\/blog\/#organization","name":"itsupportwale","url":"https:\/\/itsupportwale.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png","contentUrl":"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png","width":1119,"height":144,"caption":"itsupportwale"},"image":{"@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Itsupportwale-298547177495978"]},{"@type":"Person","@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d","name":"Techie","sameAs":["https:\/\/itsupportwale.com","iswblogadmin"],"url":"https:\/\/itsupportwale.com\/blog\/author\/iswblogadmin\/"}]}},"_links":{"self":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/posts\/4806","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-js
on\/wp\/v2\/comments?post=4806"}],"version-history":[{"count":0,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/posts\/4806\/revisions"}],"wp:attachment":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/media?parent=4806"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/categories?post=4806"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/tags?post=4806"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}

{"id":4806,"date":"2026-06-04T00:23:16","date_gmt":"2026-06-03T18:53:16","guid":{"rendered":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"},"modified":"2026-06-04T00:23:16","modified_gmt":"2026-06-03T18:53:16","slug":"aws-ai-guide-build-and-scale-smarter-applications","status":"publish","type":"post","link":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","title":{"rendered":"AWS AI Guide: Build and Scale Smarter Applications"},"content":{"rendered":"

<pre class=\"codehilite\"><code class=\"language-bash\">$ curl -v -X POST https:\/\/api.internal.production.vortex\/v1\/inference \\
\n -H "Content-Type: application\/json" \\
\n -d '{"prompt": "Analyze system logs for anomaly detection", "max_tokens": 512}'
\n
\n* Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0)
\n&gt; POST \/v1\/inference HTTP\/1.1
\n&gt; Host: api.internal.production.vortex
\n&gt; User-Agent: curl\/8.5.0
\n&gt; Accept: *\/*
\n&gt; Content-Type: application\/json
\n&gt; Content-Length: 72
\n
\n&lt; HTTP\/1.1 504 Gateway Timeout
\n&lt; Content-Type: text\/html
\n&lt; Content-Length: 160
\n&lt; Connection: keep-alive
\n
\n&lt;html&gt;
\n&lt;head&gt;&lt;title&gt;504 Gateway Time-out&lt;\/title&gt;&lt;\/head&gt;
\n&lt;body&gt;
\n&lt;center&gt;&lt;h1&gt;504 Gateway Time-out&lt;\/h1&gt;&lt;\/center&gt;
\n&lt;\/body&gt;
\n&lt;\/html&gt;
\n
\n$ aws bedrock-runtime invoke-model \\
\n --model-id anthropic.claude-v2:1 \\
\n --body '{"prompt": "\\n\\nHuman: Why is the stack failing?\\n\\nAssistant:", "max_tokens_to_sample": 300}' \\
\n --region us-east-1 \\
\n output.txt
\n
\nAn error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before retrying.
\n
\n$ tail -f \/var\/log\/cloudwatch\/bedrock-integration-errors.log
\n[2024-05-20T03:14:22Z] ERROR: Lambda runtime timed out after 29.002s.
\n[2024-05-20T03:14:23Z] ERROR: botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https:\/\/bedrock-runtime.us-east-1.amazonaws.com\/model\/anthropic.claude-v2:1\/invoke"
\n[2024-05-20T03:14:25Z] FATAL: Upstream "aws ai" service unreachable via VPC Endpoint.
\n<\/code><\/pre>
\n<h2>The 3 AM Reality Check<\/h2>\n<p>I\u2019ve been awake for 72 hours. My eyes feel like they\u2019ve been scrubbed with industrial-grade sandpaper, and my bloodstream is approximately 40% Monster Energy and 60% pure, unadulterated spite. If I see one more slide deck about "intelligent automation" or "self-healing infrastructure," I am going to throw my YubiKey into the nearest cooling fan.<\/p>\n<p>The "visionaries" in the C-suite decided six months ago that our legacy heuristic-based monitoring wasn't "forward-looking" enough. They wanted "aws ai" integration. They wanted a black box that could predict outages before they happened. Well, congratulations, Greg. The black box didn't predict the outage; the black box <em>was<\/em> the outage.<\/p>\n<p>We replaced a perfectly functional, if slightly noisy, Prometheus\/Grafana stack with a convoluted mess of Lambda functions, Bedrock calls, and "AI-driven" auto-scaling groups. When the traffic spiked on Friday night (a standard end-of-quarter batch processing load), the "aws ai" logic decided that the latency increase wasn't a resource bottleneck, but a "pattern shift." It started spinning up instances like a caffeinated squirrel, which triggered a cascading failure in our IAM evaluation logic and hit the service quotas for Bedrock faster than you can say "over-engineered."<\/p>\n<p>I\u2019m writing this because the post-mortem is due in four hours, and if I don't vent this into an IRC channel of people who actually know what a subnet mask is, I\u2019m going to quit and go farm goats in the mountains.<\/p>\n<h2>The IAM Policy from Hell<\/h2>\n<p>Let\u2019s talk about the "aws ai" permission model. You\u2019d think that granting a Lambda function access to invoke a model would be a simple <code>Allow<\/code> on <code>bedrock:InvokeModel<\/code>. But no. 
Because we\u2019re using Provisioned Throughput (which we had to buy because the On-Demand limits are a joke), the IAM requirements mutated into a multi-headed hydra.<\/p>\n<p>We spent four hours just trying to figure out why the production role, which worked in the <code>dev<\/code> account, was throwing 403s in <code>prod<\/code>. It turns out that if you\u2019re using a VPC endpoint for "aws ai" services, the endpoint policy <em>also<\/em> needs to explicitly allow the action, even if the identity-based policy is wide open.<\/p>\n<p>Here is the JSON that cost me six hours of my life, because the documentation for <code>boto3 v1.34.82<\/code> didn't mention the specific resource ARN format for provisioned models. The first statement goes in the role\u2019s identity-based policy; the second, the one with a <code>Principal<\/code>, goes in the VPC endpoint policy, because identity-based policies don\u2019t accept a <code>Principal<\/code> element. They are shown in one document here only for brevity:<\/p>\n<pre class=\"codehilite\"><code class=\"language-json\">{\n  "Version": "2012-10-17",\n  "Statement": [\n    {\n      "Sid": "BedrockScopedAccess",\n      "Effect": "Allow",\n      "Action": [\n        "bedrock:InvokeModel",\n        "bedrock:InvokeModelWithResponseStream"\n      ],\n      "Resource": [\n        "arn:aws:bedrock:us-east-1:123456789012:provisioned-model\/5x9p2q7r4s1t",\n        "arn:aws:bedrock:us-east-1::foundation-model\/anthropic.claude-v2:1"\n      ]\n    },\n    {\n      "Sid": "VPCEndpointPolicy",\n      "Effect": "Allow",\n      "Principal": "*",\n      "Action": "bedrock:InvokeModel",\n      "Resource": "*",\n      "Condition": {\n        "StringEquals": {\n          "aws:SourceVpce": "vpce-0a1b2c3d4e5f6g7h8"\n        }\n      }\n    }\n  ]\n}\n<\/code><\/pre>\n<p>The kicker? 
The <code>Resource<\/code> ARN for the provisioned model doesn’t follow the same pattern as the foundation model: the provisioned ARN carries the account ID and a generated model ID, while the foundation-model ARN leaves the account field empty and ends in the model name. If you miss one character, the SDK just returns a generic <code>AccessDeniedException<\/code> with zero hint about whether it\u2019s the IAM role, the KMS key (oh yeah, we had to encrypt the inputs), or the VPC endpoint policy. We were flying blind in a storm of our own making.<\/p>\n<h2>Latency is Not a Suggestion<\/h2>\n<p>The “aws ai” advocates love to talk about “near-instantaneous insights.” In reality, calling <code>anthropic.claude-v2:1<\/code> via a Lambda function running <code>Python 3.12.1<\/code> is about as fast as a snail crawling through molasses.<\/p>\n<p>We were seeing cold starts on the Lambda side of about 800ms, which is fine, whatever. But the actual <code>invoke_model<\/code> call? Even with Provisioned Throughput, we were hitting 2.5 to 5 seconds for simple inference. Our API Gateway has a hard 29-second timeout. When the “aws ai” logic started getting bogged down by large context windows (because the “intelligent” agent decided it needed to read the last 500 lines of syslog for every request), the entire request chain backed up.<\/p>\n<p>The “aws ai” integration essentially turned our high-throughput event bus into a sequential queue. One slow inference call held up the worker, which held up the SQS consumer, and the backlog kept growing because we couldn’t process messages fast enough; left alone, the oldest messages would have aged out at the queue’s retention limit (14 days at most). 
We were paying for “intelligence” and getting a lobotomized turtle in return.<\/p>\n<h2>The Cost-Per-Token Heart Attack<\/h2>\n<p>While I was trying to fix the 504 errors, I took a look at the Billing Dashboard. I nearly vomited.<\/p>\n<p>The “aws ai” services charge by the token. Our “visionary” implementation didn’t have any token-limiting logic in the prompt templates. The system was feeding entire JSON blobs into the model. 
Every time a developer pushed a debug log, the “aws ai” pipeline would ingest it, process it, and spit out a “summary” that cost us $0.05. Multiply that by 10,000 events per minute during the surge and you get roughly $500 a minute.<\/p>\n<p>I found a CloudWatch log snippet that showed exactly how we were burning money:<\/p>\n<pre class=\"codehilite\"><code class=\"language-json\">{\n  "timestamp": "2024-05-20T04:20:00Z",\n  "level": "INFO",\n  "message": "Model invocation successful",\n  "model_id": "anthropic.claude-v2:1",\n  "usage": {\n    "input_tokens": 4502,\n    "output_tokens": 128,\n    "total_tokens": 4630\n  },\n  "billing_estimate_usd": 0.078,\n  "request_id": "req-99-problems-and-ai-is-all-of-them"\n}\n<\/code><\/pre>\n<p>Eight cents. For one log line. We were literally burning the company\u2019s runway to have an LLM tell us that “The system is experiencing high load,” which I already knew because my pager was vibrating off the nightstand. The “aws ai” cost-per-token model is a predatory tax on companies that don’t have the sense to use a <code>grep<\/code> command.<\/p>\n<h2>VPC Endpoints and the PrivateLink Tax<\/h2>\n<p>Because we\u2019re in a “highly regulated industry,” we can\u2019t just let our traffic traverse the public internet. We have to use PrivateLink. Setting up the VPC endpoint for “aws ai” (Bedrock) was a nightmare of DNS resolution issues.<\/p>\n<p>We\u2019re using <code>AWS CLI v2.15.30<\/code>. When you run a command inside the VPC, the Bedrock hostname <em>should<\/em> resolve to the private IP of the endpoint. But because of a misconfiguration in our DHCP options set (which had been there for years but never mattered until now), the SDK kept trying to hit the public endpoint.<\/p>\n<p>Since we had no NAT Gateway in that specific private subnet (to save costs, ironically), the requests just hung until they timed out. 
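A check like the following would have caught the DNS problem in minutes instead of hours. It is only a minimal sketch, not what we actually ran: the hostname is taken from the log excerpt at the top of this post, the helper names `resolve_ips` and `all_private` are made up for illustration, and it assumes that an interface endpoint with private DNS enabled makes the regional hostname resolve to private addresses inside the VPC.

```python
import ipaddress
import socket


def resolve_ips(hostname: str, port: int = 443) -> list[str]:
    """Return the distinct IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def all_private(ips: list[str]) -> bool:
    """True only if there is at least one address and every address is private."""
    return bool(ips) and all(ipaddress.ip_address(ip).is_private for ip in ips)


if __name__ == "__main__":
    # Hostname from the ConnectTimeoutError in the log above.
    host = "bedrock-runtime.us-east-1.amazonaws.com"
    try:
        ips = resolve_ips(host)
    except OSError as exc:
        print(f"{host}: DNS lookup failed ({exc})")
    else:
        if all_private(ips):
            print(f"{host} -> {ips}: private, the endpoint DNS is in effect")
        else:
            print(f"{host} -> {ips}: PUBLIC, traffic needs a route to the internet")
```

Run it from the same subnet as the Lambda function. A public answer in a subnet with no NAT Gateway means every call will hang until the client times out, which matches what we saw.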
<\/p>\n<pre class=\"codehilite\"><code class=\"language-bash\"># The command that failed for 3 hours\n$ aws bedrock-runtime invoke-model --endpoint-url https:\/\/vpce-0a1b2c3d4e5f6g7h8-xyz.bedrock-runtime.us-east-1.vpce.amazonaws.com ...\n<\/code><\/pre>\n<p>Even after we fixed the DNS, we realized that the VPC endpoint for “aws ai” doesn’t support cross-region requests. Our failover stack in <code>us-west-2<\/code> couldn’t talk to the Bedrock models in <code>us-east-1<\/code> without a complex VPC peering setup that our network team (of one person, who is on vacation) hadn’t approved. So, the “self-healing” infrastructure was actually a “self-destructing” infrastructure if a single region had a hiccup.<\/p>\n<h2>Provisioned Throughput vs. On-Demand Chaos<\/h2>\n<p>The “aws ai” marketing says you can start small with On-Demand and scale up. That is a lie. On-Demand limits for Bedrock are so low they\u2019re practically decorative. We hit the <code>ThrottlingException<\/code> within the first ten minutes of the traffic spike.<\/p>\n<p>So, we switched to Provisioned Throughput. Do you know how much that costs? You have to commit to a “Model Unit” for either 1 month or 6 months. It\u2019s like buying a mainframe in the 70s just to run a calculator. And you can’t just “scale it up” instantly. Provisioning a new unit takes time.<\/p>\n<p>During the 72-hour hell-march, I had to explain to the CFO why we were committing to a $20,000-a-month spend just to get the “aws ai” to stop throwing 429 errors.<\/p>\n<p>The “aws ai” scaling isn’t elastic; it’s brittle. It\u2019s a glass skyscraper in an earthquake zone. When the load hit, the On-Demand side choked, and the Provisioned side wasn’t large enough to handle the overflow. 
We were stuck in a “dead zone” where we couldn’t process the backlog, and we couldn’t scale the “intelligence” fast enough to clear it.<\/p>\n<h2>Lambda Cold Starts and the Python 3.12 Runtime<\/h2>\n<p>We decided to use the latest and greatest: <code>Python 3.12.1<\/code>. Surely, the performance improvements would help with the “aws ai” overhead?<\/p>\n<p>Wrong. The <code>boto3<\/code> and <code>botocore<\/code> versions required to support the latest “aws ai” features are massive. By the time we bundled the dependencies into a Lambda layer, we were pushing the 250MB unzipped limit. This led to atrocious cold start times.<\/p>\n<p>Every time the “AI-driven” auto-scaler decided to spin up more Lambda executors, we\u2019d see a spike in 504 errors because the first few requests would time out while the container was still initializing the “aws ai” client.<\/p>\n<p>I spent four hours stripping out unnecessary sub-modules from <code>botocore<\/code> just to get the package size down. I shouldn’t be doing tree-shaking on a Python library at 4 AM just because the “aws ai” SDK is bloated with every single AWS service definition since 2006.<\/p>\n<p>Here\u2019s the snippet of the <code>serverless.yml<\/code> that I eventually had to hack together just to keep the cold starts under control:<\/p>\n<pre class=\"codehilite\"><code class=\"language-yaml\">functions:\n  ai-analyzer:\n    handler: handler.analyze\n    runtime: python3.12\n    memorySize: 3008 # Over-provisioning RAM just to get more CPU for faster imports\n    timeout: 30\n    environment:\n      BOTO_CONFIG: \/var\/task\/boto_config\n    layers:\n      - arn:aws:lambda:us-east-1:123456789012:layer:aws-ai-optimized-sdk:1\n<\/code><\/pre>\n<p>Even with 3GB of RAM, the “aws ai” initialization was sluggish. It\u2019s a fundamental mismatch: Lambda is meant for short-lived, fast-executing tasks. “aws ai” is a heavy, high-latency, state-heavy beast. 
Putting them together is like trying to put a jet engine on a tricycle.<\/p>\n<h2>The Agony of the “Black Box” Debugging<\/h2>\n<p>The worst part of this entire 72-hour ordeal wasn’t the technical hurdles. It was the lack of visibility. When a standard database query fails, I can look at the execution plan. I can see the locks. I can see the disk I\/O.<\/p>\n<p>When an “aws ai” call fails or returns garbage, I have nothing. I have a <code>request_id<\/code> and a prayer. I spent two hours trying to figure out why the model was suddenly returning empty strings. Was it a prompt injection? Was it a safety filter? Was the model having a stroke?<\/p>\n<p>The “aws ai” logs don’t tell you <em>why<\/em> a model refused to answer. They just give you a <code>finish_reason: \"content_filter\"<\/code>. Which content? Why? No one knows. It\u2019s “proprietary.”<\/p>\n<p>So there I was, the “Site Reliability Engineer,” responsible for the reliability of a site that depended on a component I couldn’t monitor, couldn’t tune, and couldn’t understand. I was just a glorified plumber trying to fix a leak in a pipe made of “magic.”<\/p>\n<h2>Lessons Learned (The Hard Way)<\/h2>\n<ol>\n<li><strong>Stop using “aws ai” for critical path logic.<\/strong> If your system can’t boot without an LLM’s permission, your system is broken by design.<\/li>\n<li><strong>Buy more RAM and run local models.<\/strong> A quantized Llama-3 model running on a beefy EC2 instance with an NVIDIA GPU is more predictable, faster, and cheaper than the “aws ai” token-based circus.<\/li>\n<li><strong>Grep is your friend.<\/strong> You don’t need a multi-billion parameter model to find an <code>ERROR<\/code> string in a log file. 
Stop over-complicating things.<\/li>\n<li><strong>VPC Endpoints are a hidden tax.<\/strong> Factor in the PrivateLink costs and the DNS headache before you commit to “secure” AI.<\/li>\n<li><strong>Provisioned Throughput is a trap.<\/strong> It\u2019s just a way to lock you into a high monthly spend for a service that should be elastic.<\/li>\n<li><strong>Documentation is a suggestion.<\/strong> The real documentation is in the <code>botocore<\/code> source code on GitHub. Read it, because the AWS docs won’t save you at 3 AM.<\/li>\n<\/ol>\n<p>I\u2019m going to sleep now. If the “aws ai” decides to hallucinate another outage, tell it to fix it itself. I\u2019m out.<\/p>\n<pre class=\"codehilite\"><code class=\"language-bash\">$ history -c\n$ logout\n$ exit 1\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"$ curl -v -X POST https:\/\/api.internal.production.vortex\/v1\/inference \\ -H “Content-Type: application\/json” \\ -d ‘{“prompt”: “Analyze system logs for anomaly detection”, “max_tokens”: 512}’ Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0) POST \/v1\/inference HTTP\/1.1 Host: api.internal.production.vortex User-Agent: curl\/8.5.0 Accept: \/ Content-Type: application\/json Content-Length: 72 < HTTP\/1.1 504 Gateway Timeout < Content-Type: text\/html < Content-Length: 160 < Connection: keep-alive … <a title=\"AWS AI Guide: Build and Scale Smarter Applications\" class=\"read-more\" 
href=\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\" aria-label=\"Read more on AWS AI Guide: Build and Scale Smarter Applications\">Read more<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4806","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"\n<title>AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale\" \/>\n<meta property=\"og:description\" content=\"$ curl -v -X POST https:\/\/api.internal.production.vortex\/v1\/inference -H “Content-Type: application\/json” -d ‘{“prompt”: “Analyze system logs for anomaly detection”, “max_tokens”: 512}’ Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0) POST \/v1\/inference HTTP\/1.1 Host: api.internal.production.vortex User-Agent: curl\/8.5.0 Accept: \/ Content-Type: application\/json Content-Length: 72 < HTTP\/1.1 504 Gateway Timeout < Content-Type: text\/html < Content-Length: 160 < Connection: keep-alive ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\" \/>\n<meta property=\"og:site_name\" content=\"ITSupportWale\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Itsupportwale-298547177495978\" \/>\n<meta property=\"article:published_time\" content=\"2026-06-03T18:53:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2021\/05\/android-chrome-512x512-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"512\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Techie\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Techie\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\"},\"author\":{\"name\":\"Techie\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d\"},\"headline\":\"AWS AI Guide: Build and Scale Smarter 
Applications\",\"datePublished\":\"2026-06-03T18:53:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\"},\"wordCount\":1561,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#organization\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\",\"url\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\",\"name\":\"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale\",\"isPartOf\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#website\"},\"datePublished\":\"2026-06-03T18:53:16+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/itsupportwale.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AWS AI Guide: Build and Scale Smarter Applications\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#website\",\"url\":\"https:\/\/itsupportwale.com\/blog\/\",\"name\":\"ITSupportWale\",\"description\":\"Tips, Tricks, Fixed-Errors, Tutorials & 
Guides\",\"publisher\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/itsupportwale.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#organization\",\"name\":\"itsupportwale\",\"url\":\"https:\/\/itsupportwale.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png\",\"contentUrl\":\"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png\",\"width\":1119,\"height\":144,\"caption\":\"itsupportwale\"},\"image\":{\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/Itsupportwale-298547177495978\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d\",\"name\":\"Techie\",\"sameAs\":[\"https:\/\/itsupportwale.com\",\"iswblogadmin\"],\"url\":\"https:\/\/itsupportwale.com\/blog\/author\/iswblogadmin\/\"}]}<\/script>\n","yoast_head_json":{"title":"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","og_locale":"en_US","og_type":"article","og_title":"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale","og_description":"$ curl -v -X POST 
https:\/\/api.internal.production.vortex\/v1\/inference -H “Content-Type: application\/json” -d ‘{“prompt”: “Analyze system logs for anomaly detection”, “max_tokens”: 512}’ Connected to api.internal.production.vortex (10.0.42.11) port 443 (#0) POST \/v1\/inference HTTP\/1.1 Host: api.internal.production.vortex User-Agent: curl\/8.5.0 Accept: \/ Content-Type: application\/json Content-Length: 72 < HTTP\/1.1 504 Gateway Timeout < Content-Type: text\/html < Content-Length: 160 < Connection: keep-alive ... Read more","og_url":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","og_site_name":"ITSupportWale","article_publisher":"https:\/\/www.facebook.com\/Itsupportwale-298547177495978","article_published_time":"2026-06-03T18:53:16+00:00","og_image":[{"width":512,"height":512,"url":"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2021\/05\/android-chrome-512x512-1.png","type":"image\/png"}],"author":"Techie","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Techie","Est. 
reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#article","isPartOf":{"@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"},"author":{"name":"Techie","@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d"},"headline":"AWS AI Guide: Build and Scale Smarter Applications","datePublished":"2026-06-03T18:53:16+00:00","mainEntityOfPage":{"@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"},"wordCount":1561,"commentCount":0,"publisher":{"@id":"https:\/\/itsupportwale.com\/blog\/#organization"},"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","url":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/","name":"AWS AI Guide: Build and Scale Smarter Applications - ITSupportWale","isPartOf":{"@id":"https:\/\/itsupportwale.com\/blog\/#website"},"datePublished":"2026-06-03T18:53:16+00:00","breadcrumb":{"@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/itsupportwale.com\/blog\/aws-ai-guide-build-and-scale-smarter-applications\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/itsupportwale.com\/blog\/"},{"@type":"ListItem","position":2,"name":"AWS AI Guide: Build and Scale Smarter 
Applications"}]},{"@type":"WebSite","@id":"https:\/\/itsupportwale.com\/blog\/#website","url":"https:\/\/itsupportwale.com\/blog\/","name":"ITSupportWale","description":"Tips, Tricks, Fixed-Errors, Tutorials & Guides","publisher":{"@id":"https:\/\/itsupportwale.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/itsupportwale.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/itsupportwale.com\/blog\/#organization","name":"itsupportwale","url":"https:\/\/itsupportwale.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png","contentUrl":"https:\/\/itsupportwale.com\/blog\/wp-content\/uploads\/2023\/09\/cropped-Logo-trans-without-slogan.png","width":1119,"height":144,"caption":"itsupportwale"},"image":{"@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Itsupportwale-298547177495978"]},{"@type":"Person","@id":"https:\/\/itsupportwale.com\/blog\/#\/schema\/person\/8c5a2b3d36396e0a8fd91ec8242fd46d","name":"Techie","sameAs":["https:\/\/itsupportwale.com","iswblogadmin"],"url":"https:\/\/itsupportwale.com\/blog\/author\/iswblogadmin\/"}]}},"_links":{"self":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/posts\/4806","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-js
on\/wp\/v2\/comments?post=4806"}],"version-history":[{"count":0,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/posts\/4806\/revisions"}],"wp:attachment":[{"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/media?parent=4806"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/categories?post=4806"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/itsupportwale.com\/blog\/wp-json\/wp\/v2\/tags?post=4806"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}