Figuring out these errors

I’m going to start looking at the site errors, but frankly I know nothing about them. I’ll just post what I find here as I go.

Clicking on the link about the errors that Discourse shows up top:

In order of criticality, the errors come first. They are all about access to port 25.
I assume the connection refused errors are a good sign, given the fix Steve did to stop us from being an open relay. Looking at the bottom one specifically:

11:09 am
667
Job exception: Connection refused - connect(2) for "172.17.0.1" port 25
1:03 pm
400379
Job exception: Connection refused - connect(2) for "172.17.0.1" port 25
1:04 pm
/usr/local/lib/ruby/2.6.0/net/smtp.rb:539:in `initialize'
/usr/local/lib/ruby/2.6.0/net/smtp.rb:539:in `open'
/usr/local/lib/ruby/2.6.0/net/smtp.rb:539:in `tcp_socket'
/usr/local/lib/ruby/2.6.0/net/smtp.rb:549:in `block in do_start'
/usr/local/lib/ruby/2.6.0/timeout.rb:93:in `block in timeout'
/usr/local/lib/ruby/2.6.0/timeout.rb:103:in `timeout'
/usr/local/lib/ruby/2.6.0/net/smtp.rb:548:in `do_start'
/usr/local/lib/ruby/2.6.0/net/smtp.rb:518:in `start'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/mail-2.7.1/lib/mail/network/delivery_methods/smtp.rb:109:in `start_smtp_session'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/mail-2.7.1/lib/mail/network/delivery_methods/smtp.rb:100:in `deliver!'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/mail-2.7.1/lib/mail/message.rb:2159:in `do_delivery'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/mail-2.7.1/lib/mail/message.rb:260:in `block in deliver'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/actionmailer-5.2.3/lib/action_mailer/base.rb:560:in `block in deliver_mail'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/activesupport-5.2.3/lib/active_support/notifications.rb:168:in `block in instrument'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/activesupport-5.2.3/lib/active_support/notifications/instrumenter.rb:23:in `instrument'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/activesupport-5.2.3/lib/active_support/notifications.rb:168:in `instrument'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/actionmailer-5.2.3/lib/action_mailer/base.rb:558:in `deliver_mail'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/mail-2.7.1/lib/mail/message.rb:260:in `deliver'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/actionmailer-5.2.3/lib/action_mailer/message_delivery.rb:114:in `block in deliver_now'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/actionmailer-5.2.3/lib/action_mailer/rescuable.rb:17:in `handle_exceptions'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/actionmailer-5.2.3/lib/action_mailer/message_delivery.rb:113:in `deliver_now'
/var/www/discourse/lib/email/sender.rb:211:in `send'
/var/www/discourse/app/jobs/regular/user_email.rb:59:in `execute'
/var/www/discourse/app/jobs/base.rb:232:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/rails_multisite-2.0.7/lib/rails_multisite/connection_management.rb:63:in `with_connection'
/var/www/discourse/app/jobs/base.rb:221:in `block in perform'
/var/www/discourse/app/jobs/base.rb:217:in `each'
/var/www/discourse/app/jobs/base.rb:217:in `perform'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:192:in `execute_job'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:165:in `block (2 levels) in process'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:128:in `block in invoke'
/var/www/discourse/lib/sidekiq/pausable.rb:138:in `call'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:130:in `block in invoke'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:133:in `invoke'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:164:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:137:in `block (6 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/job_retry.rb:109:in `local'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:136:in `block (5 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq.rb:37:in `block in <module:Sidekiq>'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:132:in `block (4 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:250:in `stats'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:127:in `block (3 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/job_logger.rb:8:in `call'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:126:in `block (2 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/job_retry.rb:74:in `global'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:125:in `block in dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/logging.rb:48:in `with_context'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/logging.rb:42:in `with_job_hash_context'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:124:in `dispatch'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:163:in `process'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:83:in `process_one'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:71:in `run'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/util.rb:16:in `watchdog'
/var/www/discourse/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.7/lib/sidekiq/util.rb:25:in `block in safe_thread'

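So it looks like the Sidekiq errors and the mail errors are possibly related.

I have no idea what Sidekiq is, so to Google we go:


A background jobs manager for Ruby.

If I had to guess, Sidekiq wraps the SMTP job: in the trace, the sidekiq frames sit underneath jobs/regular/user_email.rb, which in turn ends up in net/smtp.rb.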
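
As a side note on what "Connection refused" means in that trace: it is a plain TCP-level failure, meaning nothing was accepting connections on that address and port at the time. Below is a minimal Python sketch of the same check, using only the host and port that appear in the job exception. The function name `probe_port` is made up for illustration, and this is not how Discourse itself tests anything.

```python
import socket

def probe_port(host, port, timeout=3.0):
    """Return 'open', 'refused', or 'unreachable' for one TCP connect attempt."""
    try:
        # Same kind of TCP connect that failed in the Ruby trace
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port
        return "refused"
    except OSError:
        # Timeouts, no route to host, name resolution failures, etc.
        return "unreachable"
```

Running `probe_port("172.17.0.1", 25)` from inside the app container should come back as "refused" for as long as the error above keeps appearing, and "open" once something is listening there again.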

I’m thinking these SMTP jobs are queued up somewhere; there were 400K errors reported when I hit the page. Unless discuss is being targeted to send 400K messages right then? It seems possible, but I don’t really know. I also don’t really know how to check things like incoming network traffic.

I’m going to look into that by hunting through the logs for more information about the SMTP mailing stuff: first Discourse’s, then the system logs.

OK, production.log was quite large, so I looked at it:
(apparently I can’t copy/paste from terminals easily; OK Linux, whatever)

Seems like it’s the digest emails trying to be mailed out. I’m going to disable that temporarily, somehow.

here

led to here:

So apparently there’s a setting somewhere.

the commit shows this setting:
image

so I have to find out where to put that.

ok, well disabled for now.

I took the logs for Aug 2, manually filtered things down, and found the error rate correlated directly with sending out digests, so hopefully stopping the digests will cut down on the error rate going forward. It’s not a solution, but at least we can focus on other issues.
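
That manual filtering could be roughly scripted. The sketch below assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that digest activity and failures can be spotted by substring; the real production.log layout may well differ. The sample lines are made up for illustration and are not real log output.

```python
from collections import Counter

def count_per_hour(lines, needle):
    """Count lines containing `needle`, grouped by their 'YYYY-MM-DD HH' prefix."""
    counts = Counter()
    for line in lines:
        if needle in line:
            counts[line[:13]] += 1  # first 13 characters = date plus hour
    return counts

# Made-up sample lines, NOT real log output
sample = [
    "2019-08-02 11:02:10 Job exception: Connection refused - connect(2)",
    "2019-08-02 11:02:11 UserEmail digest queued",
    "2019-08-02 11:40:00 Job exception: Connection refused - connect(2)",
    "2019-08-02 12:15:30 UserEmail digest queued",
    "2019-08-02 12:15:31 Job exception: Connection refused - connect(2)",
]

errors = count_per_hour(sample, "Connection refused")
digests = count_per_hour(sample, "digest")
for hour in sorted(set(errors) | set(digests)):
    print(hour, "errors:", errors[hour], "digests:", digests[hour])
```

Putting the two counts side by side per hour is just a quick way to eyeball whether the spikes line up; it does not prove the digests are the cause.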

I’ve further disabled Discourse from sending emails:

FYI, this is all basically triage to remove error messages; not a single thing I’m doing will help the real issue, which is what Steve found. I see where that setting is, but I have no idea what it means to change it or not.

If anyone can help get the email server stuff set up correctly, that’d be great.

ok, error rates have subsided.

when we figure out email, the above can be reversed.

Well, I finished some other BS early and this is nagging me.

mail-receiver.yml is the container config for discourse mail.

rando@ssdnodes-05208:/var/discourse/containers$ ls -ltra
total 32
-rw-r--r--  1 root root    0 Dec  6  2018 .gitkeep
-rw-r--r--  1 root root  299 Dec  8  2018 redis.yml
-rw-r--r--  1 root root 1084 Dec  8  2018 data.yml
-rw-r--r--  1 root root 1491 Jan 30  2019 mail-receiver.yml
-rw-r--r--  1 root root 4295 Mar 27 02:31 app.yml
-rw-r--r--  1 root root  288 Mar 27 02:34 app-0.yml
drwxr-xr-x  2 root root 4096 Jun 29 17:17 .
drwxr-xr-x 11 root root 4096 Aug  2 15:36 ..

and here are the contents of mail-receiver.yml:

##
## After making changes to this file, you MUST rebuild
## /var/discourse/launcher rebuild mail-receiver
##
## BE *VERY* CAREFUL WHEN EDITING!
## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT!
## visit http://www.yamllint.com/ to validate this file as needed

base_image: discourse/mail-receiver:1.1.2
update_pups: false

expose:
  - "2533:25"   # SMTP

env:
  LANG: en_US.UTF-8

  ## Where e-mail to your forum should be sent.  In general, it's perfectly fine
  ## to use the same domain as the forum itself here.
  MAIL_DOMAIN: discuss.noisebridge.info

  ## The URL of the mail processing endpoint of your Discourse forum.
  ## This is simply your forum's base URL, with `/admin/email/handle_mail`
  ## appended.  Be careful if you're running a subfolder setup -- in that case,
  ## the URL needs to have the subfolder included!
  DISCOURSE_MAIL_ENDPOINT: 'https://discuss.noisebridge.info/admin/email/handle_mail'

  ## The master API key of your Discourse forum.  You can get this from
  ## the "API" tab of your admin panel.
  DISCOURSE_API_KEY: 4fce49b686a2c628c173555e60524c81b64fc6892ac2fc98f716fa04ed8c4cd3

  ## The username to use for processing incoming e-mail.  Unless you have
  ## renamed the `system` user, you should leave this as-is.
  DISCOURSE_API_USERNAME: system

volumes:
  - volume:
      host: /var/discourse/shared/mail-receiver/postfix-spool
      guest: /var/spool/postfx

I’m pretty sure the expose directive means forward port 2533 on the host to port 25 on the container.
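
To spell out that reading of the expose line: Docker-style port strings are written host port first, then container port. A tiny sketch of that ordering follows; the helper name `parse_port_mapping` is hypothetical, and it only handles the simple two-part form used in this file.

```python
def parse_port_mapping(mapping):
    """Split a Docker-style 'HOST:CONTAINER' port string into two ints."""
    host_port, container_port = mapping.split(":")
    return int(host_port), int(container_port)

# Value copied from the expose section of mail-receiver.yml
host_port, container_port = parse_port_mapping("2533:25")
print(f"host port {host_port} -> container port {container_port}")
```

So anything that wants to reach the mail-receiver's SMTP listener from outside the container would have to connect to 2533 on the host, not 25.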

From what little Docker I know, you then launch the container using either some kind of bridge and expose ports on the bridge, or directly onto the host network, or some other settings. I don’t see anything like that in the Discourse launcher, so I guess it just uses the default.

In the app.yml (Discourse’s container) there is the following defined:

  ## TODO: The SMTP mail server used to validate new accounts and send notifications
  # SMTP ADDRESS, username, and password are required
  # WARNING the char '#' in SMTP password can cause problems!
  DISCOURSE_SMTP_ADDRESS: 172.17.0.1
  #DISCOURSE_SMTP_PORT: 587
  DISCOURSE_SMTP_USER_NAME: discourse@noisebridge.info
  DISCOURSE_SMTP_PASSWORD: discourse
  DISCOURSE_SMTP_ENABLE_START_TLS: false # (optional, default true)

Discourse should start with these env vars I guess.

Does this mean that Discourse will try to log in to the relay with these parameters, or does this mean the relay expects these parameters? I guess the former? I’m not clear where Discourse configures the relay, so I’m left to search for where these parameters are used.
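
For what it's worth, settings shaped like these are normally client-side: the application connects to the given address and, if a username is set, authenticates with it. The following is a generic Python sketch of how an SMTP client would consume such values. It is not Discourse's actual code; the function names are made up, and the default port of 25 is an assumption based on the port line being commented out and the trace showing port 25.

```python
import smtplib

def smtp_settings(env):
    """Collect SMTP client settings from a dict of DISCOURSE_SMTP_* values."""
    starttls = str(env.get("DISCOURSE_SMTP_ENABLE_START_TLS", "true")).lower()
    return {
        "host": env["DISCOURSE_SMTP_ADDRESS"],
        # Assumed default: the port line is commented out and the trace shows 25
        "port": int(env.get("DISCOURSE_SMTP_PORT", 25)),
        "user": env.get("DISCOURSE_SMTP_USER_NAME"),
        "password": env.get("DISCOURSE_SMTP_PASSWORD"),
        "starttls": starttls == "true",
    }

def send_test(settings, sender, recipient, body):
    """Act as the client: connect, optionally STARTTLS and log in, then send."""
    with smtplib.SMTP(settings["host"], settings["port"], timeout=10) as smtp:
        if settings["starttls"]:
            smtp.starttls()
        if settings["user"]:
            smtp.login(settings["user"], settings["password"])
        smtp.sendmail(sender, recipient, body)
```

If the relay at that address is not set up to accept that username and password, a client like this fails at the login step, which is a different failure from the connection refused errors above.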

Please try re-enabling mail; I think our sheeit is now solved.

I’d re-enable it myself but apparently I’m not a Discourse admin.

Thanks for your help, I re-enabled it.

FWIW I’ve been under the assumption that anyone with root can self-escalate as needed. I admin’d you from the cli.

I tried to admin him, but it asked me to confirm via email. :sweat_smile:

Yeah, catch-22. My discovery about contemporary multi-user applications (web apps? idk) is that they are providing tools like Discourse’s launcher, Phabricator’s phd, and more, which make configuration slightly simpler than “before”.

So I guess a paradigm to look out for is that these apps we’re installing have CLI tools which we can look for to handle situations where the app itself isn’t behaving nicely.

Right now, emails are being rejected by receiving servers for reputation reasons. I didn’t get the admin approval email that was sent when I tried to make Steve admin from the UI, and the logs show it was rejected by Gmail.

I’ll leave that to others.

Try again, then let’s find a way to look at the email that Postfix is trying to send me?

EDIT: (Then I can click the verification link in that email without having to actually receive said email via email :smile: )

Actually the email goes to me. I guess it’s checking something like “did the person whose email is attached to this account really click ‘make admin’?”

I’ll revoke and do it again.

in /var/discourse/shared/standalone/logs/rails/production.log
image

No error this time about connection refused in the log, which is expected, but still a good sign.

I’m not sure how to check the email server side, but there’s nothing in Gmail for me (incl. spam and other auto-labels).
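
One possible way to check the server side, assuming the outgoing mail goes through Postfix and its log is readable somewhere on the box (the location is not known from this thread): Postfix records each delivery attempt with a `to=<...>` and a `status=...` field, so those lines can be pulled out with a small script. The sample lines below are made up in Postfix's usual format and are not taken from this server.

```python
import re

# Recipient, delivery status, and the optional parenthesised detail
STATUS_RE = re.compile(r"to=<([^>]*)>.*?\bstatus=(\w+)(?: \((.*)\))?")

def delivery_results(lines):
    """Yield (recipient, status, detail) for lines that record a delivery attempt."""
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            yield m.group(1), m.group(2), m.group(3) or ""

# Made-up sample lines, NOT taken from this server
sample = [
    "Aug  2 13:04:01 host postfix/smtp[111]: ABC123: to=<a@example.com>, "
    "relay=mx.example.com[192.0.2.1]:25, delay=1.2, dsn=2.0.0, status=sent (250 OK)",
    "Aug  2 13:04:05 host postfix/smtp[112]: DEF456: to=<b@example.com>, "
    "relay=mx.example.com[192.0.2.1]:25, delay=0.9, dsn=5.7.1, "
    "status=bounced (host said: 550 5.7.1 rejected)",
    "Aug  2 13:04:06 host postfix/qmgr[99]: DEF456: removed",
]

for recipient, status, detail in delivery_results(sample):
    print(recipient, status, detail)
```

Anything other than `sent` (for example `bounced` or `deferred`) comes with the remote server's reply in the detail, which is usually where a reputation-based rejection would be spelled out.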

@elimisteve do you have suggestions for the next steps?

I made you admin again, you can send test emails through the UI if you want.

Your discourse burger menu should now have an “admin” section.
image

From there you can use the email test thing at the bottom of Emails/Settings:

or check the Advanced Test stuff, etc.

Next steps regarding being able to send emails without them going into people’s spam folders, or our emails being rejected completely? I’m not sure what steps the global email blacklist powers that be want us to take in order to prove that we’re no longer spammers; we should research that.